Probabilistic Surface Layer Fatigue Strength Assessment of EN AC-46200 Sand Castings

The local fatigue strength within the aluminium cast surface layer is strongly affected by surface layer porosity and by notches arising from the cast surface texture. This article extends the methodology of a previously published fatigue assessment model for sand cast aluminium surface layers in T6 heat treatment condition. A new sampling position with significantly different surface roughness is investigated, and the model exponents a_1 and a_2 are re-parametrised to suit a significantly wider range of surface roughness values. Furthermore, the fatigue assessment model is studied for specimens in hot isostatic pressing (HIP) heat treatment condition at all sampling positions. The obtained long life fatigue strength results are approximately 6% to 9% conservative, so the model is considered valid within a notch valley depth range of 30 µm ≤ Sv ≤ 260 µm. To enhance engineering feasibility even further, the local concept is extended by a probabilistic approach invoking extreme value statistics. A bivariate distribution enables an advanced probabilistic long life fatigue strength assessment of cast surface textures, based on statistically derived parameters such as the extremal valley depth Sv_i and the equivalent notch root radius ρ̄_i. In summary, a statistically driven fatigue strength assessment tool for sand cast aluminium surfaces has been developed, which provides an engineering-friendly design method.


Introduction
For the fatigue strength assessment of metallic castings in mechanical engineering, the designer has to consider local material properties that depend on the manufacturing process, such as shrinkage pores or surface texture based notches. Neglecting the effect of defects on fatigue strength results in oversizing of mechanical components in order to maintain globally sufficient component safety. As lightweight construction and sustainable designs are increasingly demanded, aluminium sand cast components are often utilised, since they enable complex geometries and thus support significant weight savings of up to 50% [1]. However, it is well known that aluminium castings exhibit both internal casting defects, particularly shrinkage pores, and surface texture related micro and macro notches driven by the surface geometrical structure (SGS), which affect the local fatigue strength. Therefore, applicable assessment methodologies that consider these local influences are an advantageous tool in fatigue design. The assessment methodology presented in [36] may be validated even further by additional datasets, which should provide significantly different cast surface textures in order to broaden the applicability of the method towards a wider range of surface roughness values. In addition, as this areal sand cast surface characterisation method is based on local roughness values, the fatigue designer needs to know these manufacturing process dependent values. However, such localised information is generally not available for cast surface structures. Therefore, this concept is extended by a probabilistic approach, as previously recommended in [36]. Moreover, the effect of sub-area size ought to be investigated to cover different evaluation area magnitudes as well.
Therefore, this paper contributes to the following key aspects:
• Extension of the assessment methodology presented in [36] utilising an additional aluminium sand cast surface exhibiting a significantly different surface roughness structure.
• As-cast surfaces in the T6 heat treatment condition were the main research target in [36]; the applicability of the assessment model to cast specimens with additional hot isostatic pressing (HIP) heat treatment is evaluated.
• Robustness study of the presented method in terms of sub-area magnitude and sample size and their effect on the evaluated statistical distribution.
• Statistical characterisation of the sand cast surface texture and subsequent probabilistic evaluation of the manufacturing process related surface fatigue strength as a design recommendation for cast components.

Investigated Material
The EN numerical designation of the investigated aluminium alloy is EN AC-46200. The gravity sand cast components are crankcases, manufactured by means of the core package system (CPS) casting process [40][41][42][43]. For details about the specimen geometry and the nominal chemical composition of the material, the authors refer to the work in [36]. As an additional, different surface roughness texture is investigated for validation in this study, the specimen series with this new sampling position is denoted by P2, while the original specimen series investigated in [36] is labelled P1 in the following. Besides the variation of the specimen position, the effect of an additional HIP heat treatment (HIP+T6), as also studied in [44], is investigated and compared to the T6 heat treatment condition. The HIP+T6 specimens are subsequently labelled as HIP. The HIP process is intended to close shrinkage pores within the bulk material, and its effect on the surface layer is studied here. Typically applied HIP parameters for aluminium alloys [44][45][46][47][48], such as temperature T, pressure p and time t, are given in Table 1.
Table 1. Typical hot isostatic pressing (HIP) parameters for Al alloys [44][45][46][47][48].

T [°C]     p [MPa]    t [h]
510-521    103        2-6

Metallographic analysis revealed that the HIP specimens do not differ from the T6 specimens in microstructure in terms of secondary dendrite arm spacing (DAS), which matches the findings in [49,50]. The DAS was evaluated as described in [51], and the mean values for sampling positions P1 and P2 in T6 as well as HIP heat treatment condition are listed in Table 2. For both T6 P1 (Figure 2b) and HIP P1 (Figure 2a), a mean DAS of about 26 µm was evaluated. For specimens at sampling position P2, a slight DAS gradient is observable, see Figure 2c. This is caused by the increased solidification rate within the surface layer at this sampling position. Near the surface a DAS of about 21 µm was measured, which slowly increased to about 28 µm at a distance of 9 mm from the cast surface. This matches the results of Aigner [49], who investigated the bulk material of EN AC-46200 at sampling position P2. Although Figure 2c shows a specimen with T6 heat treatment, the HIP microstructure is identical in terms of DAS, as the process only affects the bulk material porosity. However, as DAS does not affect the fatigue strength in the presence of defects according to [6,13], differences in DAS can be neglected.
The tested material properties for HIP and T6 heat treatment are compared in Table 3. While the ultimate tensile strength R_m, the Vickers hardness HV10, Young's modulus E and the yield strength R_p0.2 differ only slightly, the elongation at rupture A is significantly increased, which matches the findings in [20,44].

Experimental
Within this work, fatigue tests were performed on HIP sand cast surfaces at sampling positions P1 and P2. Although three test series have been experimentally investigated (HIP P1, HIP P1(2), HIP P2), only two of them (HIP P1 and HIP P2) will be presented in detail to enhance clarity. However, fatigue test results such as S/N-parameters are evaluated and tabulated for all three investigated series.
All test series possess cast surfaces, and the fatigue tests were performed identically to the procedure described in [36] utilising a Rumul Cracktronic®. Due to the load stress ratio of R = 0 under bending load, the highly tensile-stressed region of the specimen is located at the cast surface layer. The S/N-curves are statistically evaluated following the procedure applied in [36]. Figure 3 depicts the nominal bending S/N-curve of the HIP test series at the new sampling position P2, which possesses a significantly reduced surface roughness, as discussed in Section 5. Within all S/N-figures, the stress amplitude σ_a is normalised to the material's near defect-free long life fatigue strength σ_LLF,0. As stated in [36], the value of σ_LLF,0 was experimentally evaluated by means of HIP specimens with machined and subsequently polished surfaces. Thus, the observed fatigue strength is unaffected by both porosity and surface roughness effects. The crack initiation cause is marked red if the crack initiated at a combination of a surface pit and a micropore located directly within the surrounding area (Cast-M), and blue if only the cast surface texture (Cast-S), that is, surface roughness, caused technical crack initiation. Figure 4 depicts examples of the fractographically evaluated crack initiating defects of Cast-S and Cast-M specimens of the HIP P1 and HIP P2 test series by means of scanning electron microscopy (SEM). The fractographic images of HIP P1 are also representative of the T6 P1 test series of [36]. For almost all Cast-M specimens, shrinkage porosity was observed to participate in crack initiation. Only in a few cases were gas pores, bifilms or intermetallic phases found to be critical. In contrast to the pores of Cast-M P1 specimens, those of Cast-M P2 were rarely broached, but were found to be located about 10 µm to 30 µm beneath the surface.
This may be caused by the elevated solidification rate within the surface layer at this sampling position. For Cast-n.d. specimens, no distinct crack initiation cause could be clearly determined by fracture surface analysis, which is why they are not subsequently taken into account for validation.
The S/N-curves (Figures 3 and 5) are given with their 90% and 10% probability of survival, and the stress scatter index T_S,1e7 is calculated according to [52] by means of Equation (2) at ten million load cycles. Table 4 lists the evaluated S/N-curve parameters: the inverse slopes k_1 and k_2, where k_2 is five times k_1 [53], the transition knee point N_T, the normalised long life fatigue strength σ_a,Ps50 and the stress scatter index T_S,1e7. Figure 5 depicts the evaluated S/N-curve of the HIP specimen series at sampling position P1, which corresponds to the original sampling position presented in [36]. The colouring of the markers is the same as in Figure 3. The accompanying fracture surface analysis revealed great similarity to the samples of the T6 P1 test series in [36]. As the evaluation of the long life fatigue strength of the HIP P1 specimen series by means of the arcsin √p method [54] would lead to a smaller scatter within the long life region than in the finite life region, the normalisation process of S/N-curves was applied as proposed in [52,55] and also executed in [36]. The evaluated S/N-curve results are listed in Table 4 as well. Additionally, the evaluated long life fatigue strength σ_a,Ps50 of the T6 test series with cast surface, as reported in [36], is highlighted in Figure 5 by the purple dash-dotted line for comparison.
At this point, it must be stated that the similarity of the fatigue results of Cast-M and Cast-S specimens does not imply that surface layer porosity can be neglected. Surface layer porosity and cast surface texture show similar fatigue results if they are evaluated independently. Thus, they both affect the fatigue strength in a comparable manner. However, if the two defect types initiate cracks in combination, as observed in Cast-M specimens, see Figure 4a,c, they lead to similar fatigue test results even if the individual defects are smaller.
In summary, smaller surface layer inhomogeneities combined with a less detrimental surface texture may be more critical than a single, distinct surface pit. Therefore, they may be treated in a combined manner. For more information regarding the combinatorial failure mechanism, see the work in [36].
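As a rough illustration of the S/N-curve parameters named above, the following Python sketch evaluates a bilinear S/N-curve whose second inverse slope is five times the first, together with a stress scatter index. It assumes a Basquin-type power law and the commonly used definition T_S = 1 : (σ_a,Ps10 / σ_a,Ps90); Equation (2) and the exact evaluation procedure of [52] are not reproduced here, and all function names and numerical values are hypothetical.

```python
def cycles_to_failure(sigma_a, sigma_knee, n_knee, k1):
    """Cycles to failure on a bilinear S/N-curve (assumed Basquin-type law).

    sigma_a    : stress amplitude (same normalisation as sigma_knee)
    sigma_knee : stress amplitude at the transition knee point
    n_knee     : load cycles at the transition knee point N_T
    k1         : inverse slope above the knee; below it k2 = 5 * k1 is used
    """
    if sigma_a <= 0.0 or sigma_knee <= 0.0:
        raise ValueError("stress amplitudes must be positive")
    k = k1 if sigma_a >= sigma_knee else 5.0 * k1
    return n_knee * (sigma_a / sigma_knee) ** (-k)


def scatter_index(sigma_ps10, sigma_ps90):
    """Ratio sigma(Ps = 10%) / sigma(Ps = 90%), usually quoted as 1 : ratio."""
    if sigma_ps10 <= 0.0 or sigma_ps90 <= 0.0:
        raise ValueError("stress amplitudes must be positive")
    return sigma_ps10 / sigma_ps90


# Hypothetical, normalised example values (not taken from Table 4).
n_high = cycles_to_failure(1.0, sigma_knee=0.5, n_knee=1.0e6, k1=5.0)
n_low = cycles_to_failure(0.45, sigma_knee=0.5, n_knee=1.0e6, k1=5.0)
t_s = scatter_index(0.55, 0.45)
```

Below the knee point the curve is much flatter in this sketch, so a small reduction in stress amplitude yields a large gain in life.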

HIP Effect on the Cast Surface Layer
While the HIP process leads to significantly enhanced fatigue strength in bulk material testing [45,50,56-59], this effect was not present for any of the investigated HIP cast surface series. For one HIP P1 test series the evaluated long life fatigue strength σ_a,Ps50 was above the T6 value (as presented in Figure 5), whereas for the other series, HIP P1(2), σ_a,Ps50 was slightly lower; both values lie within the 90% and 10% stress scatter band of the HIPped samples. For the Cast-S specimens, it was found that the evaluated surface roughness values, and therefore the fatigue strength estimated by means of the introduced fatigue assessment model, were comparable to those of the cast T6 P1 series. Therefore, it is reasonable that the Cast-S points of the HIP P1 specimens in Figure 5 fit the cast T6 P1 S/N-curve. However, regarding the Cast-M specimens, one may expect a significantly higher fatigue strength due to closed shrinkage porosity. Investigations on metallographic T6 specimens revealed a higher porosity within the surface layer, especially up to a certain depth, see Figure 6. Almost without exception, these were shrinkage pores that evolved during the solidification process. The left sub-figure shows the detected micropores, while the right diagram plots the evaluated degree of porosity. By means of a user-defined routine, pores are detected on an image of the metallographic specimen captured with a digital optical microscope. The degree of porosity was then evaluated by counting the black pixels of the image, which have been identified as pores, in relation to the white pixels within the same horizontal line. Subsequently, mean values of the degree of porosity were calculated within a vertical range of 250 µm. It is clearly recognisable that the highest degree of porosity occurs at a depth of about 1 mm to 2 mm measured from the cast surface.
To obtain information about the spatial distribution of the micropores, the metallographic specimen was subsequently ground, thereby removing about 100 µm, and evaluated again. This procedure was carried out several times for four T6 specimens, resulting in 140 metallographic analyses in total. It emerged that the trend of the degree of porosity, as depicted in Figure 6, is representative of the T6 P1 specimen series. An increased degree of porosity near the cast surface of AlSi castings was also reported by Leitner et al. [60]. This increased porosity formation may be explained by the oxide entrainment mechanism as a result of turbulent mould filling, see [61][62][63]. However, a more comprehensive insight into the degree of porosity can be obtained by means of XCT-scans [21,60].
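The row-wise porosity evaluation described above can be sketched in a few lines of Python. This is only a minimal illustration and not the authors' user-defined routine: it assumes an already binarised image given as a list of pixel rows (1 = pore, 0 = matrix, first row at the cast surface), and it relates the pore pixels of each row to all pixels of that row, which is an assumption about the normalisation. The function name and the example image are hypothetical.

```python
def porosity_profile(binary_image, pixel_size_um, band_um=250.0):
    """Mean degree of porosity per depth band.

    binary_image  : list of rows, each a list of 0 (matrix) or 1 (pore)
    pixel_size_um : edge length of one pixel in micrometres
    band_um       : vertical averaging range (250 µm in the text)
    Returns a list of (band start depth in µm, mean pore fraction).
    """
    # Pore fraction of every horizontal pixel line.
    row_fraction = [sum(row) / len(row) for row in binary_image]
    rows_per_band = max(1, int(round(band_um / pixel_size_um)))
    profile = []
    for start in range(0, len(row_fraction), rows_per_band):
        band = row_fraction[start:start + rows_per_band]
        profile.append((start * pixel_size_um, sum(band) / len(band)))
    return profile


# Hypothetical 4 x 4 image with 125 µm pixels, i.e. two rows per 250 µm band.
image = [
    [0, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 1],
]
profile = porosity_profile(image, pixel_size_um=125.0)
```

Repeating this evaluation for every ground section would give the depth-resolved porosity trend for a whole specimen.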
The same investigation and evaluation procedure was conducted for four HIP specimens, again resulting in 140 metallographic analysis slices. A representative result is depicted in Figure 7. It is clearly recognisable that the HIP process led to significantly reduced, or even partially completely suppressed, porosity within the bulk material. Only within the first millimetre in depth was porosity still observable, as the HIP process did not close these micropores. As this is within the highly stressed region of the specimen, crack initiation is still caused by the mixed cases of surface roughness and microporosity (Cast-M), thus resulting in similar long life fatigue strength values, as previously discussed. The comparably high probability of observing a mixed (Cast-M) defect case becomes visible when the cast surfaces in T6 and HIP condition are compared by means of SEM. Figure 8a,b depicts SEM images of the cast T6 surface at sampling position P1. Both sub-figures show that the cast surface is frequently broached by cavities, or shrinkage pores. Those cavities can be found both within surface pits and at surface peaks. In Figure 8b, even the dendritic structure of the α-phase is observable. Thus, there is a relatively high chance of crack initiation occurring at a combination of surface pits due to the surface roughness and surface layer porosity.
In Figure 9, SEM images of the investigated cast surfaces in HIP condition are shown for both sampling positions P1 and P2. For comparison of the cast surface textures, Table 5 lists the mean values Sa_mean of the global Sa roughness parameter of all specimens with respect to the sampling position. Additionally, the 10-90% scatter values are given. The cast surface of P1 (Figure 9a) is basically identical to the cast surface in T6 heat treatment condition presented in Figure 8a. In particular, broached pores are again observable in a similar amount. Those cavities and dendritic channels, created by the solidification process, can reach down to about 1 mm in depth in some cases, as those micropores cannot be closed by the HIP process. This matches the statement of Atkinson [56] on surface connected porosity. Thus, if the surface is machined subsequent to the casting process with the aim of removing near-surface porosity, the machining has to cover a certain depth. For the exemplified case of Figure 7, removing only 0.5 mm of the cast surface would lead to a broached pore. As broached pores also substantially decrease the local fatigue strength, such a machining process would not have the intended favourable fatigue effect and would lead to fatigue strength results similar to those including the cast surface, as presented in [36]. For the cast surface in HIP condition at sampling position P2, see Figure 9b, no broached pores were recognisable on the surface. This matches the results of the HIP P2 fracture surface analysis, where predominantly surface layer pores were observed which do not broach the cast surface, but are located about 10 µm to 30 µm beneath it. As already mentioned, this might be caused by the significantly increased solidification rate at sampling position P2 compared to P1.

Fatigue Assessment Model
This section describes the adaptation of the fatigue assessment model, as originally presented in [36], to HIP surfaces as well as to sampling position P2, which possesses a significantly different surface roughness. Thereby, the model's application range in terms of surface roughness parameter values is studied.

Modification of the Model
First, the local roughness values Sv_local and ρ at the crack initiation point as well as the modified parameter Sv_rev were evaluated. Originally, Sv_rev was introduced as the mean value of the crack initiating surface pits; it has now been replaced by a statistical parameter based on the most critical surface pits. It represents the value with 50% probability of occurrence of the distribution of the five largest sub-area valley depths Sv (Sv_i GEV), which is described in detail in Section 5. It was found that the exponents a_1 = 0.6 and a_2 = 2 presented in [36] led to overly conservative results for Cast-S specimen failures taken from sampling position P2. Thus, these parameters were adapted to a_1 = 0.4 and a_2 = 1.8 to improve the range of applicability of the basic assessment concept.
For the Cast-M specimen failures, the neural network (NN) introduced in [36] has been reduced to only four neurons and four input variables. The pore location and elongation parameters e_min, e_max and α have been replaced by the statistical roughness value Sv_rev. Thus, the overall condition of geometry dependent distinct roughness values is now considered by Sv_rev. Therefore, only the defect size √area, the local maximum pit height Sv_local, the equivalent notch root radius ρ and the statistical pit depth with 50% probability of occurrence Sv_rev act as input variables. The network was trained with four specimens of the HIP P2 test series in addition to the twenty-five T6 P1 specimens from [36]. The coefficient of determination for the training set was R² = 0.988, resulting in the interaction coefficients ψ listed in Table 6. Therein, the mean value ψ_mean, the standard deviation ψ_std and the minimum ψ_min and maximum ψ_max values are listed. The individual interaction coefficient ψ of each Cast-M specimen is subsequently used within Equation (4) to calculate the mixed fatigue strength reduction factor K_f,m, which takes the fatigue strength reduction factor K_f,p of surface layer microporosity and the surface fatigue notch factor K_f,s of surface roughness based notches into account as a combinatorial defect case. K_f,p is calculated by Equation (3), as introduced in [36].

Validation of the Model
After modification of the concept in order to improve the overall performance, the model was validated by means of the HIP P2 Cast-S data as well as the remaining two Cast-M specimens, which had not been used for training of the modified neural network. The fatigue assessment result is depicted in Figure 10. The fatigue strength is normalised by the material's near defect-free long life fatigue strength σ_LLF,0, which was evaluated at a load stress ratio of R = 0 under bending load at ten million load cycles. The black dashed line marks the long life fatigue strength with 50% probability of survival, taken from the fatigue testing, see Figure 3. The fatigue strength values obtained for each specimen from the fatigue assessment model are depicted in blue for Cast-S specimens and in red for Cast-M specimens. Additionally, for each specimen, the experimental fatigue test data point is extrapolated to ten million load cycles and plotted in Figures 10 and 11, denoted by σ_a,1e7. Overall, the fatigue strength assessment is 7% conservative with regard to the fatigue strength σ_LLF,*,Ps50 with a probability of survival of 50%, relative to the experimental fatigue strength result σ_a,Ps50. The evaluated fatigue strength σ_LLF,*,Ps10 with a probability of survival of 10% is 6.1% conservative as well. The stress scatter index T_S,1e7 of the model results was evaluated by Equation (2). At this stage, only six specimens from the HIP P2 test series were available for validation; however, the model has so far led to sound fatigue results for both Cast-M specimens and Cast-S specimens, the latter being assessed without a neural network. The experimental and model-based fatigue strength results of the validation series HIP P2 are compared in Table 7. To prove the fatigue assessment model's applicability to HIP-treated cast surfaces even further, the specimen series HIP P1 was used for additional validation.
Figure 11 shows the calculated fatigue assessment results. Again, Cast-S specimens with crack initiation at a surface pit due to the surface roughness are marked in blue, and Cast-M specimens representing a combinatorial defect case with interaction of surface roughness and surface layer porosity are marked in red. The model's long life fatigue strength at 10%, 50% and 90% probability of survival, the evaluated stress scatter index T_S,1e7 and the experimental fatigue strength σ_a,Ps50 of the associated S/N-curve, represented by the dashed black line, are plotted again. Both Cast-S and Cast-M specimens are well assessed, resulting in an overall 9.3% conservative long life fatigue strength design at a probability of survival of 50%. The stress scatter index increased compared to HIP P2, but still shows sound results, as it is below the value of the associated S/N-curve, see Figure 5. An overview of the validation results of the HIP P1 series is also given in Table 7. Table 7 also lists the validation results of the specimen series HIP P1(2), utilising 16 specimens. Further, as the assessment of both Cast-S and Cast-M specimens has been adapted, the validation data set of [36] with 14 specimens has been re-evaluated and is also given in Table 7, labelled as T6 P1. Utilising the modified fatigue assessment model, the result for T6 P1 becomes slightly more conservative compared to the results in [36], and the stress scatter index increases. However, the overall applicability to sand cast aluminium surface layers with different heat treatment conditions, possessing cast surface textures and surface layer porosity, is confirmed by these comprehensive validation sets. In summary, the results of the estimated fatigue strength are about 6% to 9% conservative.
Therefore, the introduced model supports an engineering feasible local fatigue assessment concept, to assess distinctions in cast surface roughness structures and their effect on cyclic endurance limit.
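The degree of conservativeness quoted above can be read as the relative amount by which the model-based long life fatigue strength falls below the experimental value. The short Python sketch below illustrates this reading; the formula is an assumption about how the percentages were obtained, and the function name and numerical values are hypothetical rather than taken from Table 7.

```python
def conservativeness_percent(sigma_model, sigma_experiment):
    """Relative deviation of the model result from the experiment in percent.

    Positive values mean the model predicts a lower (conservative) fatigue
    strength than the experiment; negative values are non-conservative.
    """
    if sigma_experiment <= 0.0:
        raise ValueError("experimental fatigue strength must be positive")
    return (sigma_experiment - sigma_model) / sigma_experiment * 100.0


# Hypothetical normalised fatigue strengths.
deviation = conservativeness_percent(sigma_model=0.93, sigma_experiment=1.0)
```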

Probabilistic Fatigue Strength Assessment
The proposed surface fatigue assessment model utilises local roughness values evaluated at crack initiation points, identified by means of fracture surface analyses after fatigue testing. However, in engineering design, no a priori knowledge about the surface texture is available; instead, probabilistic values of the surface layer act as a link to the manufacturing process dependent surface layer properties. Moreover, the random variable Sv_rev, a parameter based on the distribution of distinctive Sv sub-area values, is an important factor for an appropriate long life fatigue strength calculation. Sv_rev should therefore be evaluated accurately, which requires a statistically based recommendation for the sample size of surface measurements.
Thus, this section addresses the probabilistic assessment of crack initiating surface roughness pits. Previously conducted experiments on Cast-S specimens revealed that in almost all cases crack initiation occurred at one of the five deepest surface pits, that is, at one of the five highest Sv values. However, the crack is not necessarily initiated at the maximum value Sv_max of all evaluated sub-areas, as the notch root radius also contributes to the notch stress effect. In conclusion, the authors suggest considering the five highest Sv values of each investigated surface as representative of the statistical surface roughness effect. Subsequently, this series of the five deepest surface valleys is denoted by Sv_i with 1 ≤ i ≤ 5. The following sections discuss the effect of sub-area size, the statistical distribution and the recommendable sample size of surface measurements, and finally demonstrate a fatigue strength assessment based on probabilistic surface values.

Sub-Area Size Effect
First, for the statistical characterisation of cast surface textures based on sub-area values, the sub-area size has to be chosen. The effect of the selected sub-area size is depicted in Figure 12. It shows the mean value of the five deepest surface pits Sv_i,mean plotted over the sub-area size, normalised to the maximum valley depth of the surface Sv_max. Four randomly selected surfaces covering both T6 and HIP heat treatment condition were investigated, subsequently labelled as specimens 1 to 4. To study the effect of sub-area size, the surface structures are evaluated over the same region of each specimen. The three images (panels (a-c)) within Figure 12 all have the same dimensions and show the same region of the cast surface of specimen 1. As originally published in [36], the Sv value of a sub-area of size A_sub is evaluated according to Equation (5). For a 1 mm × 1 mm sub-area size (Figure 12c), the roughness pit is basically covered by a single patch.
Thus, the five extremal Sv_i valley depth values characterise five different surface pits. Comparing this result to the evaluation with 0.25 mm × 0.25 mm sub-areas in Figure 12a, it is recognisable that two or more of the five patches now capture the same surface roughness pit in an adjacent manner. Therefore, no independent statistical description is achieved, as the chosen sub-area regions are related. This effect of increasingly characterising the same surface pit is also indicated in the diagram in Figure 12, as the normalised ratio increases abruptly from 0.5 mm to 0.25 mm sub-area side length. Following this trend of continuously decreasing sub-area size, one would end at a ratio of nearly one, when all five sub-areas reflect the deepest spot within the deepest pit of the surface. Of course, this sub-area characterisation also depends on the location of the sub-areas, which is determined by the original definition of the surface measurement frame. On the other hand, if the sub-area size is too large, possible crack initiating pits may be neglected; for example, three critical surface pits that are close to each other may be covered by only one patch. This also does not lead to a sufficient extreme value characterisation.
Finally, the recommendable sub-area size can also be linked to the size of the sand grains used in the mould, which is typically within the range of 100 µm to 300 µm according to Campbell et al. [40]. As the sand grain sizes observable in Figures 8 and 9 are up to several hundred µm, a sub-area side length of 0.25 mm would be too small to reliably characterise a single surface pit. Taking these findings into account, the applied sub-area size of 1 mm × 1 mm is an appropriate and recommendable choice for the investigation of the present sand cast surface textures. At this point it should be mentioned that the measured surface should cover a large area of the cast surface to obtain enough sub-area entries reflecting the casting manufacturing process itself. Based on the chosen sub-area size, at least 100 patches should be evaluated. Within this study, approximately 240 sub-areas lie within the investigated cast surface area per cast T6 or HIP-treated specimen.
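To make the sub-area evaluation more tangible, the following Python sketch tiles a height map into square, non-overlapping patches, evaluates a valley depth per patch and extracts the five deepest values together with their normalised mean. It assumes that heights are given relative to the mean plane and that Sv of a patch is the magnitude of its lowest point, in the spirit of the areal parameter Sv; Equation (5) of the original model is not reproduced, and all names and data are hypothetical.

```python
def subarea_valley_depths(height_map, patch):
    """Valley depth Sv of every non-overlapping patch x patch sub-area.

    height_map : 2-D list of heights relative to the mean plane (e.g. in µm)
    patch      : sub-area side length in pixels
    """
    n_rows, n_cols = len(height_map), len(height_map[0])
    depths = []
    for r in range(0, n_rows - patch + 1, patch):
        for c in range(0, n_cols - patch + 1, patch):
            lowest = min(
                height_map[i][j]
                for i in range(r, r + patch)
                for j in range(c, c + patch)
            )
            # A patch lying entirely above the mean plane has no valley.
            depths.append(max(0.0, -lowest))
    return depths


def five_deepest(depths):
    """The five largest sub-area valley depths Sv_i, deepest first."""
    return sorted(depths, reverse=True)[:5]


def normalised_mean_of_deepest(depths):
    """Mean of the deepest values divided by the maximum valley depth."""
    top = five_deepest(depths)
    if not top or top[0] <= 0.0:
        raise ValueError("no valley found in the evaluated sub-areas")
    return (sum(top) / len(top)) / top[0]


# Hypothetical 4 x 4 height map evaluated with 2 x 2 pixel sub-areas.
heights = [
    [0.0, -1.0, 2.0, 3.0],
    [1.0, -5.0, 0.0, -2.0],
    [-3.0, 0.0, 1.0, 1.0],
    [0.0, 1.0, 2.0, 4.0],
]
sv = subarea_valley_depths(heights, patch=2)
ratio = normalised_mean_of_deepest(sv)
```

With very small patches, several of the five values would stem from the same pit and the ratio would approach one, which is the dependency effect discussed in the text.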

Distribution Parametrisation
For statistical analysis, fatigue-initiating defects can be characterised by means of an extreme value distribution [64]. In terms of limiting extreme value distributions, three types were originally defined by Gnedenko [65]: the Gumbel distribution (type 1), the Fréchet distribution (type 2) and the Weibull distribution (type 3). The applicable type is determined by the distribution of the basic population from which the extreme value sample has been taken. Jenkinson [66] introduced the Generalised Extreme Value (GEV) distribution, which covers those three types and is therefore suitable for extreme value statistics. The formulation of the cumulative distribution function of the GEV is given in Equation (6). Therein, δ is the scale parameter, µ is the location parameter and ξ is the shape parameter of the distribution. They are often estimated by means of the maximum likelihood method [67,68]. The type of the distribution is assigned by the value of ξ, so that the most appropriate GEV type is given by the data itself. Thus, the GEV distribution is frequently used to statistically describe the crack initiating extremal defect size [7,8,14].
Accordingly, the probability of occurrence of a value greater than a chosen threshold value x is given by the complement of the cumulative distribution function F(x) of Equation (6):
$$P(X > x) = 1 - F(x) \qquad (7)$$
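For orientation, the GEV cumulative distribution function and the corresponding exceedance probability can be sketched in Python as follows. The sketch uses the standard textbook form of the GEV with shape ξ, location µ and scale δ, which is assumed to correspond to Equation (6); the function names and the example parameters are hypothetical and not taken from Table 8.

```python
import math


def gev_cdf(x, xi, mu, delta):
    """Standard GEV cumulative distribution function F(x).

    xi    : shape parameter (xi = 0 gives the Gumbel type)
    mu    : location parameter
    delta : scale parameter, must be positive
    """
    if delta <= 0.0:
        raise ValueError("scale parameter must be positive")
    z = (x - mu) / delta
    if abs(xi) < 1e-12:
        return math.exp(-math.exp(-z))
    t = 1.0 + xi * z
    if t <= 0.0:
        # Outside the support: below the lower bound for xi > 0,
        # above the upper bound for xi < 0.
        return 0.0 if xi > 0.0 else 1.0
    return math.exp(-t ** (-1.0 / xi))


def gev_exceedance(x, xi, mu, delta):
    """Probability of a value greater than the threshold x."""
    return 1.0 - gev_cdf(x, xi, mu, delta)


# Hypothetical parameters for a valley depth distribution in µm.
p_exceed = gev_exceedance(200.0, xi=0.1, mu=150.0, delta=30.0)
```

Note that some libraries define the shape parameter with the opposite sign, so fitted parameters should be checked against the convention used before they are inserted here.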
As outlined before, the five highest surface pit depth values Sv i are well suited for parametrisation of a GEV distribution. As both sampling positions as well as the heat treatment of the surface vary, GEV parameters are evaluated to characterise each manufacturing process-based surface texture. Within Figure 13, three examples of GEV distributions are depicted. The green dash-dotted line symbols the GEV distribution of the T6 P1 validation series, whereas the other two HIP series are marked as yellow dashed line for sampling position P1 and as black continuous line for sampling position P2. Additionally, the Sv i values of each evaluated surface are plotted within the diagram. The HIP P2 series showed the lowest extremal values Sv i . However, comparing the GEV distributions of the HIP P1 series and the T6 P1 series at the same sampling position, the HIP P1 series exhibited higher values of Sv i instead. It was observed that the HIP post treatment may affect the extremal Sv i distribution but keeps the basic population mostly unchanged. It should be noted that the population itself is dependent from the local casting condition and thus no general course of surface valley depth is feasible, but the extremal values can be well parametrised to reflect the local casting process. Table 8 lists the evaluated distribution parameters of the Sv i GEV distribution as well as the statistical parameter Sv rev . The value Sv rev is based on the associated Sv i GEV distribution and is calculated as the value with 50% probability of occurrence (Sv rev = Sv i (P = 50%)). This characteristic value is subsequently used in the derived fatigue strength model and characterises the extremal surface pits in a probabilistic manner. Moreover, for a probabilistic fatigue assessment, the distribution of the equivalent notch root radius ρ has to be evaluated as well. 
The ρ values are taken from the same sub-areas as the Sv i values, thus characterising the notch root radius of the five deepest, most critical surface pits, and are subsequently denoted by ρ i . In order to check for a linear dependency between Sv i and the associated ρ i , the coefficient of determination R², which is the squared Pearson correlation coefficient (SPCC) [69], was calculated as a measure of the strength of an assumed linear relationship. It is defined as the square of the ratio of the covariance of two random variables A and B to the product of their standard deviations S A and S B , see Equation (8).
This results in a coefficient of R²(Sv i , ρ i ) = 0.01, which indicates almost no linear dependency between Sv i and ρ i . Thus, they are treated as two independent random variables. The evaluated GEV distributions of ρ i are plotted in Figure 14 and the fitted distribution parameters are also listed in Table 8. As small notch root radii are more critical in terms of fatigue strength, the probability of occurrence has to be plotted inversely. Thereby, the T6 P1 series shows the lowest, i.e., most critical, notch root radii, while the HIP P2 series seems to possess mostly shallow notch curvatures.
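The squared Pearson correlation coefficient can be computed directly from its definition. The sketch below is illustrative only; the function name r_squared and the example values are hypothetical and do not reproduce measured data.

```python
def r_squared(a, b):
    """Squared Pearson correlation coefficient of two equally long samples."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    # Sample covariance and sample variances (n - 1 in the denominator)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (n - 1)
    var_a = sum((x - mean_a) ** 2 for x in a) / (n - 1)
    var_b = sum((y - mean_b) ** 2 for y in b) / (n - 1)
    # (cov / (S_A * S_B))^2, written without square roots
    return cov ** 2 / (var_a * var_b)

# Hypothetical pit depths (um) and associated notch root radii (um)
sv_i = [120.0, 95.0, 140.0, 110.0, 130.0]
rho_i = [60.0, 85.0, 70.0, 90.0, 55.0]
print(f"R^2 = {r_squared(sv_i, rho_i):.3f}")
```

A value close to zero indicates that no linear relationship is present; it does not by itself rule out a non-linear dependency.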

Impact of Sample Size
For engineering feasibility, it is essential for the design engineer to know how many surfaces should be assessed in order to obtain statistically reliable information about the cast surface texture.
To evaluate an adequate sample size of surface measurements for an assumed basic population, the following procedure is suggested by the authors. The overall workflow is depicted in Figure 15, exemplified for the Sv i GEV distribution of the T6 P1 specimen series. This distribution, presented in Figure 13, has already been obtained from 34 cast surfaces, resulting in 170 Sv i values in total (five values for each specimen). The evaluated GEV distribution parameters ξ, µ and δ in Table 8 are subsequently treated as the parameters of the basic population and are used to generate synthetic random samples of a given size. The dataset S1 acts as the reference set, as it is based on the original distribution parameters, while the set S2 is randomly derived. Both datasets are parametrised as GEV distributions and evaluated at probability steps of 0.5%, leading to 200 values at equally spaced probabilities. The two datasets S1 and S2 are compared by means of the coefficient of determination R².
As an example, let the synthetic sample size be one, so that five random Sv i values are generated. These synthetically generated random values simulate new samples and are thus applicable for comparison. In the next step, a GEV distribution is fitted to the five random Sv i values, resulting in the parameters ξ 2 , µ 2 and δ 2 of the synthetic sample set. Based on that distribution, the synthetic dataset S2 is computed by calculating n = 200 Sv i values at equally spaced probabilities of occurrence, increased by 0.5% per step (0.5% ≤ P ≤ 99.5%). Finally, the two datasets S1 and S2 can be compared and assessed by the value of R²(S1, S2). If, in this exemplary case, the GEV distributions of S1 (based on the originally evaluated basic population comprising 34 samples) and S2 (based on only one sample) matched, the coefficient of determination would be one. To obtain statistically reliable correlation measures, the random generation and subsequent evaluation of S2 is repeated several times. In detail, this R²(S1, S2) evaluation procedure has been conducted 100 times for sample size one before the sample size is increased stepwise. The effect of the sample size is depicted in Figure 16a for the distribution of Sv i of the T6 P1 specimen series. Therein, the mean values of the coefficient of determination R² mean (S1, S2) are given for each sample size. Furthermore, the range of R²(S1, S2) values with a probability of occurrence higher than 10% is highlighted in red. Once the user-defined criterion (R²(S1, S2) with P ≥ 0.1) ≥ 0.99 is fulfilled, a satisfactory correlation between the sets S1 and S2, i.e., between the GEV distributions of the original and the synthetic samples, is achieved. This area is marked in grey within Figure 16a and was reached at a sample size of nineteen in this case. It is clearly visible that the scatter of the R²(S1, S2) value decreases as the sample size increases.
Thus, the original sample size of 34 investigated specimens has most likely already led to an essentially stable distribution.
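The resampling procedure described above can be sketched in a few lines. This is a minimal illustration under several assumptions: the population parameters are hypothetical and not those of Table 8, the 200 probabilities are spread evenly between 0.5% and 99.5%, and the GEV fit uses the probability-weighted moments estimator of Hosking et al. instead of maximum likelihood in order to keep the sketch free of external dependencies. All function names are hypothetical.

```python
import math
import random

def gev_quantile(p, mu, delta, xi):
    """GEV value at cumulative probability p (0 < p < 1, xi != 0)."""
    return mu + delta * ((-math.log(p)) ** (-xi) - 1.0) / xi

def fit_gev_pwm(values):
    """Estimate (mu, delta, xi) by probability-weighted moments (>= 3 values)."""
    x = sorted(values)
    n = len(x)
    b0 = sum(x) / n
    b1 = sum(j * x[j] for j in range(n)) / (n * (n - 1))
    b2 = sum(j * (j - 1) * x[j] for j in range(n)) / (n * (n - 1) * (n - 2))
    c = (2.0 * b1 - b0) / (3.0 * b2 - b0) - math.log(2.0) / math.log(3.0)
    k = 7.8590 * c + 2.9554 * c * c        # Hosking's shape parameter, k = -xi
    if abs(k) < 1e-6:
        k = 1e-6                           # avoid division by zero near the Gumbel case
    g = math.gamma(1.0 + k)
    delta = (2.0 * b1 - b0) * k / (g * (1.0 - 2.0 ** (-k)))
    mu = b0 + delta * (g - 1.0) / k
    return mu, delta, -k

def r_squared(a, b):
    """Squared Pearson correlation coefficient."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    return cov ** 2 / (sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))

def sample_size_study(mu, delta, xi, n_specimens, repeats=100, per_specimen=5, seed=0):
    """Return R^2(S1, S2) for `repeats` synthetic sample sets of the given size."""
    rng = random.Random(seed)
    probs = [0.005 + i * 0.99 / 199 for i in range(200)]   # 0.5% ... 99.5%
    s1 = [gev_quantile(p, mu, delta, xi) for p in probs]   # reference set S1
    results = []
    for _ in range(repeats):
        # Draw synthetic extremal values by inverse transform sampling
        sample = [gev_quantile(min(max(rng.random(), 1e-12), 1.0 - 1e-12), mu, delta, xi)
                  for _ in range(n_specimens * per_specimen)]
        s2 = [gev_quantile(p, *fit_gev_pwm(sample)) for p in probs]
        results.append(r_squared(s1, s2))
    return results

MU, DELTA, XI = 100.0, 30.0, 0.1           # hypothetical population parameters
for n_specimens in (1, 5, 20):
    r2 = sorted(sample_size_study(MU, DELTA, XI, n_specimens))
    mean_r2 = sum(r2) / len(r2)
    r2_p10 = r2[len(r2) // 10]             # approximate lower 10% quantile
    print(n_specimens, round(mean_r2, 4), round(r2_p10, 4), r2_p10 >= 0.99)
```

The last printed column applies the criterion under the assumption that it refers to the lower 10% quantile of the R² values. The absolute numbers depend on the chosen estimator and population parameters, so only the trend with increasing sample size should be read from such a sketch.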
The confidence interval of the distribution can also be evaluated as a measure of the change in the mean Sv i values. As both Sv rev and Sv i rely on the distribution in a probabilistic fatigue assessment, a tight confidence interval is desirable. As Sv rev is defined as Sv i (P = 0.5), the 80% confidence interval at P = 0.5 has been studied. For each sample size, the evaluation loop of 100 repetitions led to a certain scatter of the Sv i values. Figure 16b shows the behaviour of the confidence intervals of the distributions based on the synthetic sample sets. The mean values of the upper and lower Sv i confidence bounds are marked as red triangles per sample size, and their overall 10-90% Sv i (P = 0.5) area is bordered by the black dotted line. The Sv rev value of 114 µm, as listed in Table 8, is represented by the bold continuous black line. The evaluated sample size threshold from Figure 16a is also drawn, as well as the grey highlighted area in which the user-defined criterion is fulfilled. This indicates that, at the current sample size threshold, the Sv rev value lies within 100 µm ≤ Sv rev ≤ 128 µm. This corresponds to a scatter index of ± 12% for sample size nineteen, or T Sv i = 1 : 1.28. If a tighter confidence interval is desired, the sample size of surface measurements has to be increased.
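The quoted scatter measures can be checked from the interval bounds. The short calculation below assumes that the scatter index is the half-width of the interval relative to Sv rev and that the scatter band is the ratio of the upper to the lower bound; both definitions are inferred from the numbers and are not stated explicitly in the text.

```latex
\frac{128\,\mu\mathrm{m} - 100\,\mu\mathrm{m}}{2 \cdot 114\,\mu\mathrm{m}} \approx 0.12,
\qquad
T_{Sv_i} = 1 : \frac{128\,\mu\mathrm{m}}{100\,\mu\mathrm{m}} = 1 : 1.28
```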

Fatigue Strength Assessment
Utilising both the Sv i GEV distribution of the roughness parameter Sv and the ρ i GEV distribution of the notch root radius, a bivariate distribution can be evaluated. As it has been proven that Sv i and ρ i can be handled as independent random variables, the combined probability of occurrence P(Sv i ≥ y, ρ i ≤ x) can be calculated by multiplication of the two single probability functions following Equation (9).
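For two independent variables, the combined probability is the product of the exceedance probability of the pit depth and the cumulative probability of the notch root radius, P(Sv i ≥ y, ρ i ≤ x) = [1 − F Sv (y)] · F ρ (x). The following sketch evaluates this product, assuming the standard GEV parametrisation with ξ ≠ 0; the parameter sets and function names are hypothetical and are not taken from Table 8.

```python
import math

def gev_cdf(x, mu, delta, xi):
    """Cumulative probability F(x) of a GEV distribution (xi != 0)."""
    t = 1.0 + xi * (x - mu) / delta
    if t <= 0.0:
        return 0.0 if xi > 0 else 1.0      # outside the support
    return math.exp(-t ** (-1.0 / xi))

def gev_quantile(p, mu, delta, xi):
    """Value with cumulative probability p (0 < p < 1, xi != 0)."""
    return mu + delta * ((-math.log(p)) ** (-xi) - 1.0) / xi

def combined_probability(y, x, sv_params, rho_params):
    """P(Sv_i >= y, rho_i <= x) for independent Sv_i and rho_i."""
    p_sv = 1.0 - gev_cdf(y, *sv_params)    # pit at least y deep
    p_rho = gev_cdf(x, *rho_params)        # root radius of at most x
    return p_sv * p_rho

# Hypothetical (mu, delta, xi) parameter sets, values in micrometres
SV_PARAMS = (100.0, 30.0, 0.1)
RHO_PARAMS = (70.0, 20.0, 0.05)

p = combined_probability(150.0, 60.0, SV_PARAMS, RHO_PARAMS)
print(f"P(Sv_i >= 150 um, rho_i <= 60 um) = {p:.4f}")
```

Evaluating the function on a grid of y and x values yields a surface of the kind shown for the bivariate distribution; at the two median values the product equals 0.25.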
The bivariate cumulative distribution function is plotted as an example in Figure 17 for the T6 P1 specimen series. Figures A1-A3, depicting the bivariate cumulative distribution function of the other investigated specimen series, are provided in Appendix A. The projected GEV distributions of Sv i and ρ i are additionally plotted as red lines and the surface mesh colour depends on the value of P(Sv i ≥ y, ρ i ≤ x). The available Cast-S specimens of this series are marked in blue. In conclusion, the surface fatigue assessment model can be applied utilising these probabilistic surface texture values. The result is the probabilistic cast surface long life fatigue strength σ LLF,s (P(Sv i ≥ y, ρ i ≤ x)), which is normalised to the near defect-free long life fatigue strength σ LLF,0 , as plotted in Figure 18. The mesh colour again highlights the combined probability of occurrence P(Sv i ≥ y, ρ i ≤ x). The 3D-view is additionally depicted from all three axis projections (Figure 18a-c) to show the dependency of the modelled fatigue strength on both parameters. By comparison of the ρ i -plot (Figure 18a) with the Sv i -plot (Figure 18b) it is recognisable that, although both parameters have an effect on the fatigue strength result, the effect of the notch depth, i.e., the surface pit depth Sv, is more pronounced.
This probabilistic fatigue assessment procedure facilitates a fatigue design that is feasible in engineering practice by providing a reliable calculation of the long life fatigue strength, utilising the probability of occurrence of the two statistical model parameters Sv i and ρ i . Based on the fatigue design safety requirements, the designer can obtain the long life fatigue strength with comparatively low effort in surface measurements and surface texture evaluation.

Discussion
The presented fatigue strength assessment approach, reparametrising the model introduced by Pomberger et al. [36] for sand cast aluminium surface layers with T6 heat treatment, has been applied to two HIP-treated testing series at similar sampling positions (HIP P1 and HIP P1 (2)). Furthermore, another cast surface with significantly reduced surface roughness has been investigated (HIP P2). By means of the reparametrised exponents a 1 and a 2 , the HIPped validation specimen series (refer to Table 7) as well as the validation series from [36] (T6 P1) showed sound results in terms of long life fatigue strength estimation. Due to the diversification of the measured surface roughness data, this concept is now valid for a wider range of cast surfaces. Moreover, current research investigations also apply the introduced method to additively manufactured surface textures.
The HIP-treated cast surfaces revealed that the surface layer pores have partly not been closed. This is caused by the high number of broached cavities, as depicted in Figures 8 and 9. Metallographic analyses revealed shrinkage pores as cavities reaching depths of up to one millimetre. Therefore, when machining the cast surface, it should be considered that surface layer pores may be broached by the machining process and may result in a similar long life fatigue strength reduction as observed for the cast surface texture.
To assess not only surface-initiated cracks as in Cast-S specimens, but also surface layer porosity, the neural network has been retrained by means of four input variables on four neurons. The input variables are now the pore size √area, the local surface pit depth Sv local , the equivalent notch root radius ρ and the statistical surface roughness parameter Sv rev . As this methodology may not be available to designers, and is therefore not easily feasible in engineering practice, the authors also study a simplified deduction of the interaction coefficient. The first results indicate that applying the mean values presented in Table 6 to the whole associated sampling series leads to sufficient approximations.
Regarding a probabilistic Cast-S specimen fatigue assessment, the five largest values Sv i of the surface pit depth Sv are valid for extreme value statistics. The evaluated distributions of the probabilistic fatigue model parameters Sv i and ρ i depend on the selected sub-area size. As illustrated, a sub-area size that is too small may not contain sufficient information about the number and depth of critical surface pits, but increasingly characterises only the most critical one and would lead to a more conservative assessment. Within this study, it was found that for the investigated sand cast surface textures, a sub-area size of 1 mm × 1 mm is valid. It should be noted that this recommended value is a multiple of the sand grain size. Manufacturing processes resulting in finer surface structures, such as additive manufacturing, may use 0.5 mm × 0.5 mm sub-areas instead. As the presented surface long life fatigue assessment model uses the statistical surface roughness parameter Sv rev , the population of the data for distribution fitting should be substantial enough to facilitate sound distribution conformity. Within this study, it was found that evaluating about twenty specimens, with five Sv i values each, leads to stable distribution parameters of the extremal values. As both fatigue model parameters, the extremal notch valley depth and the averaged notch root radius, can be handled as independent random variables, the probabilities of both distributions can be combined multiplicatively, leading to a combined probability of occurrence P(Sv i ≥ y, ρ i ≤ x) and subsequently to the probabilistic cast surface long life fatigue strength σ LLF,s (P(Sv i ≥ y, ρ i ≤ x)). Thus, the presented methodology provides a statistically applicable design tool to assess the cast surface effect on the local fatigue strength.

Conclusions
Based on the results presented in this paper, the following conclusions can be drawn.

• The presented surface layer fatigue assessment model is valid for aluminium sand cast surfaces in T6 and HIP treatment condition within the investigated range of Sv = 30 µm to 260 µm. Long life fatigue strength estimation results are approximately 6% to 9% conservative.
• The HIP process does not reliably close surface layer pores within the first millimetre of surface layer depth. Therefore, in the investigated manufacturing showcase, the machining process has to remove at least one millimetre of the surface layer to increase the endurable long life fatigue strength by removing surface layer porosity.
• Extremal surface roughness pits may become deeper, i.e., more critical, due to the HIP process. However, this was not the case for all investigated surfaces.
• For probabilistic fatigue strength assessment, the sub-area size is significant. Sub-area side lengths have to be chosen properly according to the present cast surface texture. For the investigated sand cast aluminium surfaces, a sub-area size of 1 mm × 1 mm is valid. For statistical characterisation, the measured cast surface texture should cover at least about one hundred sub-areas.
• The statistically assessed surface texture parameters Sv i and ρ i , used in the cast surface fatigue strength assessment model, are independent variables and can both be statistically described by a GEV distribution. To reliably fit the distribution, at least 20 specimens should be measured, resulting in 100 Sv i and ρ i values. By means of a bivariate distribution, a probabilistic cast surface long life fatigue strength σ LLF,s (P(Sv i ≥ y, ρ i ≤ x)) can be subsequently calculated and used in fatigue design applications.