Characterization of Austenitic Stainless Steels with Regard to Environmentally Assisted Fatigue in Simulated Light Water Reactor Conditions

Abstract: A substantial amount of research effort has been applied to the field of environmentally assisted fatigue (EAF) due to the requirement to account for the EAF behaviour of metals for existing and new build nuclear power plants. We present the results of the European project INcreasing Safety in NPPs by Covering Gaps in Environmental Fatigue Assessment (INCEFA-PLUS), during which the sensitivities of the fatigue life of austenitic steels to strain range, environment, surface roughness, mean strain and hold times, as well as to their interactions, have been characterized. The project included a test campaign, during which more than 250 fatigue tests were performed. The tests did not reveal a significant effect of mean strain or hold time on fatigue life. An empirical model describing the fatigue life is in agreement with predictions based on NUREG/CR-6909. A limited sub-programme on the sensitivity to hold times at zero force conditions and at elevated temperature did not show the beneficial effect on fatigue life found in another study.


Introduction
According to the most recent report of the Intergovernmental Panel on Climate Change (IPCC), "Nuclear energy is a mature low-GHG [green house gas] emission source of baseload power, but its share of global electricity generation has been declining (since 1993). Nuclear energy could make an increasing contribution to low-carbon energy supply, but a variety of barriers and risks exist" [1]. Hence, long-term operation (LTO) of the current fleet of nuclear power plants (NPPs) can make an important contribution to controlling GHG emissions, especially in the short term. However, this requires proper understanding of the relevant damage mechanisms in NPPs.
Environmentally assisted fatigue (EAF) is one of these damage mechanisms; test programmes in Japan, the U.S. and later in Europe have shown that the water environment in NPPs reduces the fatigue life $N_f$ significantly. Nevertheless, EAF was not explicitly taken into account during the construction of the currently operating fleet of NPPs [2,3]. The most recent guidance for EAF assessment is the U.S. regulation NUREG/CR-6909, Rev. 1 [4], in its final version from May 2018, which is based on an extensive collection mainly of Japanese and U.S. data. In that document, the effect of the environment on $N_f$ is described by an environmental factor $F_{en}$:
$$F_{en} = \frac{N_{f,air,RT}}{N_{f,LWR}} \qquad (1)$$
where $N_{f,air,RT}$ is the fatigue life in air at room temperature and $N_{f,LWR}$ the fatigue life in the environment at operating conditions. However, the low cycle fatigue lives predicted by CR-6909 do not reflect current pressurized water reactor (PWR) plant experience: where the loading conditions were known, no failures attributed to environmental fatigue have been observed so far [2]. Furthermore, studies on laboratory specimens found experimental fatigue lives to be longer than predictions based on CR-6909 [5,6]. This indicates that the guidance provided by CR-6909 includes significant conservatism, which could potentially be reduced without loss of operational plant safety. Accordingly, EAF has received much attention in the last few years [5-15].
Chopra et al. presented a recent review with a focus on ASME Code Section III [3], in which the $F_{en}$ for austenitic stainless steels is described as:
$$F_{en} = \exp\left(-T^* \, \dot{\varepsilon}^* \, O^*\right) \qquad (2)$$
where $T^*$, $\dot{\varepsilon}^*$ and $O^*$ are functions of the environmental temperature, the positive strain rate and the dissolved oxygen content. Other parameters like surface finish and complex waveforms are not modelled explicitly, but are accounted for through constant subfactors [4]. However, some authors have observed cases where the combined effect of surface finish and the environment is less damaging than might be expected when considering both effects independently [10,11,16]. These findings suggest there might be interaction effects between the surface finish and the environment.
Similarly, a number of studies investigated the influence of the waveform and especially mean strain [8] and hold time periods on environmental fatigue [6,13]. While mean strain did not have a major effect on fatigue life, it turned out that at least under certain conditions, introducing hold times at some cycles during the fatigue life can extend the fatigue life of austenitic steels in the PWR environment.
The project INCEFA-PLUS (INcreasing Safety in NPPs by Covering Gaps in Environmental Fatigue Assessment) [17] was started in 2015 under the umbrella of the European Horizon 2020 programme to characterize some of this conservatism. It includes a major test programme with more than 250 (mostly strain controlled) fatigue tests in air and a simulated light water reactor (LWR) environment carried out in 11 European laboratories. While most of the tests were carried out according to a single test matrix that was optimized by the design of experiments method, some specific aspects were addressed in separate sub-programmes.
This work describes the test programme in detail and analyses the data from the main programme, in which the effects of five test parameters, as well as their two-factor interactions, are considered. The most relevant factors and interactions are identified. Two sub-programmes respectively address hold time effects and conditions under which less environmental impact is expected (i.e., smaller $F_{en}$).
The implications for actual plant assessment were discussed elsewhere [18]. The analyses presented there were based on an earlier data evaluation similar to the one presented here, but based on a slightly smaller database. The conclusions for fatigue assessment in plants are not affected by the small difference in the underlying database.

Materials
The large majority (86%) of the tests were carried out on a single batch (XY182 sheet 23201) of 304L stainless steel produced by Creusot Loire Industries. The remaining tests were carried out on a single batch of 321 (8%) and different batches of 304, 304L and 316L. The chemical composition of the different steels is listed in Table 1. All materials were annealed at temperatures between 1050 °C and 1100 °C. The annealing time was 5 h for the 321 material and between 0.4 and 2 h for the other steels.

Test Programme
The test programme consisted of a main programme and several sub-programmes dedicated to specific questions arising during the project.
The main programme initially aimed at studying the sensitivities of fatigue life $N_f$ to the parameters strain range $\varepsilon_r$ (difference between the maximum and minimum strain during the test), mean strain $\varepsilon_m$ (strain level midway between the minimum and the maximum strain in a test), hold time $t_h$ (period with constant strain), surface roughness $R_t$ and environment $E$. These parameters were selected based on the interest of the project partners and the EPRI gap report [2]. The main programme was divided into three consecutive phases (see Section 3.1 for details) to be able to refocus the testing once the first trends became apparent in the data. The data from Phase I did not show any effect of mean strain ([19] and Section 3.1.3), and this factor was dropped in the later phases. The factor mean strain $\varepsilon_m$ had been introduced as the easiest means to simulate the constant load applied to NPP components during steady state operation. However, because of shakedown early in the test, the mean strain did not have a significant effect on fatigue life. A sub-programme was therefore started to simulate the constant load via tests with mean stress under strain control. The results of this sub-programme were published separately [20].
It also became apparent that applying hold times during some cycles in this study did not have a major effect on fatigue life. However, significant effects of hold times on fatigue life were reported in a different study [13]. To investigate whether differences in the application of holds led to these differences, a limited sub-programme on hold time tests was started (Section 3.3).
Furthermore, a small test programme with conditions where less environmental effects were expected (i.e., smaller $F_{en}$) than in the main programme was performed (Section 3.2).
In the absence of a dedicated standard for EAF tests in LWR conditions, the tests were performed as much as possible according to ISO 12106:2017, the standard for strain controlled fatigue testing [21], with additional guidance taken from other relevant standards such as ASTM E606 [22], ISO 11782-1 [23], BS 7270:2006 [24] and AFNOR A03-403 [25]. To reduce the scatter caused by differences in testing practices between the different laboratories, further guidelines were developed that provide more detailed guidance than is normally included in a testing standard [26]. All test data were uploaded to a dedicated materials database operated by the European Commission (MatDB) and have received digital object identifiers (DOIs) to ensure long-term storage and traceability.
As an additional quality assurance measure, each test was validated by a panel of fatigue experts from within the project and rated with regard to the quality of the test and the completeness of the information in the database. The test quality was determined on the basis of data like the cyclic stress amplitudes and hysteresis curves [26]. Where necessary, this information was complemented by microstructural characterization of the specimens [27]. From the strain controlled tests carried out during the project, ninety-four percent received a quality rating of one or two (out of four) and were accepted without restriction for analysis.
Besides these tests on uni-axial specimens, the project also included a sub-programme on membrane specimens. The results of this sub-programme were published separately [28].

Test Conditions
The main test programme addressed the influences of the factors strain range $\varepsilon_r$, hold time $t_h$, surface roughness characterized by the total height of the roughness profile $R_t$, mean strain $\varepsilon_m$ and environment on the fatigue life of austenitic stainless steels. Preliminary data analyses yielded no indications of significant effects of mean strain and hold times on $N_f$; these two factors were therefore removed during the later phases of the test programme.
Each of the three test phases was optimized by means of the design of experiments (DOE) method [29]. As usual for experimental campaigns for linear models optimized by DOE, all factors were tested on two levels. In the case of continuous factors (like strain range), the minimum and maximum values in the interval of interest were chosen. Using the extreme values maximizes the sensitivity of the test result to the factor settings (because of the higher leverage of the extreme values compared to intermediate values). The only exception to this rule is the surface roughness: while all smooth specimens have a very similar surface roughness, the grinding process used to obtain the rougher surface finishes yielded a roughness distribution rather than a discrete value (Section 3.1.2). For categorical factors like hold time, the two levels were "without holds" and "with holds". Table 2 lists the test conditions applied in the main programme. The test conditions were selected to be as plant relevant as possible while keeping test durations realistic (especially for hold times and minimum strain range).
The surface roughness is characterized for all specimens by the maximum roughness height $R_t$ and the average roughness $R_a$ as specified in ISO 4287:1997 [30]. The smooth surface finishes achieved by polishing were very reproducible, so not all specimens were measured individually, and generic roughness values were used for most polished specimens. Because of the larger scatter in the surface roughnesses of the rough specimens, $R_a$ and $R_t$ for all ground specimens were measured individually by optical confocal profilometry according to ISO 4288:1997 [31]. In this work, $R_t$ is used rather than $R_a$ because it is the parameter that can be expected to have more impact on crack initiation: a deeper scratch leads to a larger stress concentration, which facilitates crack initiation.
As both values are strongly correlated (Figure 1), the choice of the surface characteristic is not expected to have a major impact on the analysis.

Table 2. Test conditions in the main programme and the sub-programme on low $F_{en}$ testing. Columns: Parameter | Low Level | Middle Level | High Level | Comment.

Figure 1. Correlation between $R_t$ and $R_a$ for the specimens in the database (throughout this work, runout specimens are marked by a separate symbol). The ratio between $R_t$ and $R_a$ is 8.7.
The PWR and VVER (a Russian PWR design) chemistries are defined in Table 3. In some cases, slightly different water chemistries were used because of different practices in the national power plants. These differences are not expected to have a significant impact on fatigue life, but are recorded in the central database. All tests with the material 321 (and only these) were performed in the VVER environment.
According to CR-6909, the $F_{en}$ for austenitic stainless steels can be formulated as [4]:
$$F_{en} = \exp\left(-T^* \, \dot{\varepsilon}^* \, O^*\right) \qquad (3)$$
where $T^*$, $\dot{\varepsilon}^*$ and $O^*$ are the parameters derived from the temperature $T$ (in °C), the strain rate $\dot{\varepsilon}$ (in %/s) and the dissolved oxygen content. For the conditions used in this work (Tables 2 and 3), these are defined as:
$$T^* = \frac{T - 100}{250} \qquad (4)$$
$$\dot{\varepsilon}^* = \ln\left(\frac{\dot{\varepsilon}}{7}\right) \qquad (5)$$
$$O^* = 0.29 \qquad (6)$$
For the test conditions in the main programme ($T$ = 300 °C, $\dot{\varepsilon}$ = 0.01 %/s, Tables 2 and 3), Equation (3) yields $F_{en}$ = 4.57.
Figure 2 shows the 170 tests that were included in the analysis of the data from the main programme [32]. Four of these tests were runouts, i.e., tests that were stopped before specimen failure for other reasons (e.g., because of a technical problem with the test rig); they are marked by a separate symbol in Figure 2. These tests are considered as right censored data in the analysis. The mean air curve for austenitic steels from NUREG/CR-6909 [4] and the same curve divided by the $F_{en}$ are plotted for reference.
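As a rough numerical check of the $F_{en}$ value quoted above, the following minimal sketch evaluates the exponential expression for the conditions of this work. It assumes the transformed parameters take the form $T^* = (T - 100)/250$, $\dot{\varepsilon}^* = \ln(\dot{\varepsilon}/7)$ with $\dot{\varepsilon}$ in %/s, and $O^* = 0.29$; the function name is hypothetical, and no range limits or saturation thresholds are implemented, so it should not be used outside the parameter range discussed here.

```python
import math

def f_en_stainless(temp_c, strain_rate_pct_s, o_star=0.29):
    """Environmental factor F_en = exp(-T* * eps_rate* * O*).

    Assumed forms (valid only for the conditions discussed in this work):
    T* = (T - 100) / 250 with T in degC, eps_rate* = ln(rate / 7) with rate in %/s.
    """
    t_star = (temp_c - 100.0) / 250.0
    rate_star = math.log(strain_rate_pct_s / 7.0)
    return math.exp(-t_star * rate_star * o_star)

# Main programme conditions: 300 degC and 0.01 %/s
print(round(f_en_stainless(300.0, 0.01), 2))  # 4.57
```

The result reproduces the value of 4.57 stated in the text, which supports the assumed parameter definitions for this range.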

Data Overview
The definition of fatigue life $N_f$ used in this study is $N_{25}$, i.e., the cycle at which the maximum cyclic stress has dropped by 25% compared to the extrapolated stabilized behaviour. In cases where $N_X$ values other than $N_{25}$ are reported, these were converted to $N_{25}$ by means of Equation (18) in NUREG/CR-6909 [4]:
$$N_{25} = \frac{N_X}{0.947 + 0.00212\,X} \qquad (7)$$
where $X$ is the stress drop in percent used as the failure criterion.
While the majority of the tests in the environment were carried out using solid specimens in autoclaves, some data were acquired on hollow specimens where the water flows through the specimen. For hollow specimens, $N_f$ is generally the cycle at which leakage occurs. This is considered a rough equivalent to $N_{25}$ [4].
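The conversion between failure criteria can be sketched in a few lines. The sketch assumes that the conversion equation has the form $N_{25} = N_X / (0.947 + 0.00212\,X)$, with $X$ the stress drop in percent; the function name and the example life are hypothetical.

```python
def to_n25(n_x, x_percent):
    """Convert a fatigue life N_X (X % stress drop criterion) to N_25.

    Assumes N_25 = N_X / (0.947 + 0.00212 * X); for X = 25 the denominator is 1.
    """
    return n_x / (0.947 + 0.00212 * x_percent)

# Hypothetical test reported with a 50 % stress drop criterion
print(round(to_n25(10000, 50)))  # 9497
```

Under this assumed form, a life reported at a larger stress drop is reduced slightly, since the specimen reaches the 25% criterion earlier.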
Each organization used their own specimen type and geometry. For the air tests, the specimen diameters varied between 3.6 and 10.0 mm; the solid specimens for the tests in the environment had diameters between 3.6 and 9.0 mm. The hollow specimens had inner diameters between 9 and 12 mm.
Because of the internal pressure in hollow specimens, the stress state in hollow specimens is different from the membrane stress in solid specimens. It is therefore not obvious that the fatigue lives obtained with both types of specimens can be compared directly [33][34][35]. A study carried out within INCEFA-PLUS led to the conclusion that no significant effect on the mean values is expected for the data discussed here [36] (this analysis was done on an earlier (smaller) data set, but has been confirmed with the final dataset). Therefore, no further distinction between the two types of specimens is made here. For hollow specimens, the strains are used directly as measured, and no strain correction as suggested in [35] was applied.
The distribution of the independent variables in the main programme is summarized in Figure 3. The relatively low number of tests with a positive mean strain $\varepsilon_m$ and a positive hold time $t_h$ reflects the fact that these parameters were dropped in Test Phases II and III, respectively (Table 2).

Data Analysis
Before starting the actual data analysis, it is useful to check for possible correlations. The correlation $r_{i,j}$ between the input parameters $x_i$ and $x_j$ is given by:
$$r_{i,j} = \frac{\sum_k (x_{i,k} - \bar{x}_i)(x_{j,k} - \bar{x}_j)}{\sqrt{\sum_k (x_{i,k} - \bar{x}_i)^2 \, \sum_k (x_{j,k} - \bar{x}_j)^2}} \qquad (8)$$
where $\bar{x}$ is the mean of $x$ and the sums run over all tests $k$. The correlation $r_{i,j}$ can take values in the interval [−1;1]. Values of $|r_{i,j}|$ close to 1 indicate strong (anti-)correlations. If $|r_{i,j}|$ is close to 0, $x_i$ and $x_j$ are not correlated. A strong correlation between $x_i$ and $x_j$ means that tests with high values of $x_i$ also tend to have high values of $x_j$. Similarly, a strong anticorrelation between $x_i$ and $x_j$ means that high values of $x_i$ are often associated with low values of $x_j$. Strong (anti-)correlations between the inputs can easily lead to wrong conclusions during the evaluation because the associated effects cannot be separated.
The three phases of the main test programme were optimized by the design of experiments method [29], which also minimises the correlation between the factors. However, the available collection of tests deviated from the planned test matrix, since some tests were invalid or not carried out as specified. Furthermore, additional data were contributed by some project partners, and some test conditions were modified during the project. These circumstances could have introduced correlations between the independent variables. Table 4 lists the correlations between the factors in the main programme. The largest (anti-)correlations were found between $t_h$ and $R_t$ and between $\varepsilon_r$ and $R_t$. An anticorrelation between $t_h$ and $R_t$ was expected since in Phase III, no tests with holds were carried out any more, whereas in Phase II, a higher surface roughness $R_t$ was introduced. Therefore, one would expect tests with holds to have on average a lower $R_t$. The correlation between $\varepsilon_r$ and $R_t$, however, is unexpected. Most likely, it is a random effect resulting from the grinding process that was used to produce the rough surface finishes and that yielded a distribution of surface roughnesses rather than specific $R_t$ values (Figure 3). In absolute value, these two largest (anti-)correlations were below 0.15 and should not have a major impact on the evaluation.
The actual data analysis was based on a second degree factorial model, i.e., a model including the main effects and all second order interactions:
$$\ln N_f = I + \sum_i \alpha_i x_i + \sum_{i<j} \alpha_{ij} x_i x_j \qquad (9)$$
The $x_i$ are the different factors (such as $R_t$). The parameters $\alpha_i$ and $\alpha_{ij}$ are the model parameters for the main effects and the two-factor interactions, and $I$ is the intercept. For every test, an equation like Equation (9) is formulated. The best model is the model for which the parameters $\alpha_i$, $\alpha_{ij}$ and $I$ best describe the experimental data. A lognormal distribution for $N_f$ is assumed as recommended in ISO 12107 [37]. In a lognormal distribution, the expected (i.e., mean) value $E(X)$ of the lognormally distributed variable $X$ is:
$$E(X) = \exp\left(\mu + \frac{\sigma^2}{2}\right) \qquad (10)$$
where $\mu$ and $\sigma$ are the mean and standard deviation of the natural logarithm of $X$.
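The correlation check and the lognormal mean can be illustrated with a short sketch. It assumes the correlation is the standard Pearson coefficient and that the lognormal mean is $\exp(\mu + \sigma^2/2)$; the function names and the small two-level design are hypothetical and are not data from the project.

```python
import math

def pearson(x_i, x_j):
    """Pearson correlation coefficient between two equally long sequences."""
    mean_i = sum(x_i) / len(x_i)
    mean_j = sum(x_j) / len(x_j)
    cov = sum((a - mean_i) * (b - mean_j) for a, b in zip(x_i, x_j))
    var_i = sum((a - mean_i) ** 2 for a in x_i)
    var_j = sum((b - mean_j) ** 2 for b in x_j)
    return cov / math.sqrt(var_i * var_j)

def lognormal_mean(mu, sigma):
    """Expected value of X when ln(X) has mean mu and standard deviation sigma."""
    return math.exp(mu + 0.5 * sigma ** 2)

# Hypothetical balanced two-level design: every strain range is tested
# once without and once with holds, so the two factors are uncorrelated.
strain_range = [0.6, 0.6, 1.2, 1.2]   # %
hold_time = [0.0, 72.0, 0.0, 72.0]    # h
print(pearson(strain_range, hold_time))  # 0.0
```

In a balanced design such as this one the correlation vanishes; missing or modified tests break the balance and produce non-zero values like those discussed for the project data.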
The model parameters in Equation (9) depend on the scaling of the factors $x_i$. Normalizing the factors to the range [−1;1] allows comparing the impact of the different main and interaction effects by simply comparing the corresponding $\alpha$ parameters. Table 5 lists the normalization conventions for the factors in the main programme. In this work, the superscript "$()^n$" indicates normalized factors, e.g., $R_t^n$ is the normalized $R_t$. For consistency, the categorical factors like the environment $E$ are labelled similarly ($E^n$).
The aim of the current study is not only to obtain a numerical model that allows predicting the fatigue life of a specimen under a specific set of test conditions, but especially to determine which of the investigated factors have a significant impact on fatigue life. Therefore, the selected model should not only describe the data, but also include only those variables that have a significant effect on fatigue life. Many algorithms are available for fitting a model to the data. For the present study, we chose the backward elimination method [38]. This algorithm starts with a full model, including all factors and interactions that are being considered (in this case, a second order factorial model, Equation (9)), and evaluates the predictive performance of this model. In the next step, one model parameter (main effect or interaction) is removed, and the performance of the reduced model is evaluated. This procedure is repeated iteratively until only the intercept is left. Because exactly one term is removed per step, models with different numbers of factors can be compared more easily than with algorithms that do not eliminate factors at all or that do not change the number of factors in every step.
The model that best fits the data is not necessarily the most useful model since models with more parameters can easily overfit the data (i.e., fit the noise). Two approaches were used here for model selection. In the first approach, the data set is divided into a training set and a validation set. The data in the training set are used to determine the model parameters $\alpha_i$, $\alpha_{ij}$ and $I$. The data in the validation set are then used to evaluate the predictive performance of the model. Since the data in the validation set were not used to determine the model parameters, the predictive performance of the model on the validation set is a good measure for the model performance under new conditions within the parameter range in which the model was optimized.
From Figure 2, it is clear that the data sets can be roughly separated into four distinct groups by the two levels of $\varepsilon_r$ and $E$. The training and validation sets are selected in such a way that 75% of the data in each of the four groups are in the training set and 25% in the validation set. This approach is shown in Figure 4a, where the -LogLikelihood for the training and the validation sets is plotted over the iteration steps of the algorithm. The -LogLikelihood, the negative natural logarithm of the likelihood function, is a measure for the goodness of fit, whereby smaller numbers indicate a better fit. The iteration steps of the algorithm start with Step Number 0, i.e., the full model including all main effects and all two-parameter interactions. Moving to the left along the abscissa follows the progression of the algorithm until, at the leftmost step (here, Step 15), only the intercept remains.

Figure 4. Comparison of the model performances as a function of the step in the algorithm, i.e., the number of factors that were removed from the model. Note that progression on the abscissa is from right to left. The vertical red line indicates the optimal model according to the algorithm. (a) -LogLikelihood for the training and validation sets. (b) BIC for the full data set; the green area indicates "very good" model performance (strong evidence that a model is comparable to the best model); the yellow area indicates "good" model performance (weak evidence that a model is comparable to the best model) [39].
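The normalization and the structure of the second order factorial model can be sketched as follows. The sketch assumes a linear mapping of the lowest and highest factor level to −1 and +1 (the actual conventions are those of Table 5), treats the environment simply as a ±1 variable, and uses hypothetical factor ranges, names and coefficients.

```python
from itertools import combinations

def normalize(value, low, high):
    """Map value linearly from [low, high] to [-1, 1] (assumed convention)."""
    return 2.0 * (value - low) / (high - low) - 1.0

def factorial_terms(factors):
    """Return the main effects plus all two-factor interaction terms of one test."""
    terms = dict(factors)
    for (name_i, x_i), (name_j, x_j) in combinations(factors.items(), 2):
        terms[f"{name_i}*{name_j}"] = x_i * x_j
    return terms

def linear_predictor(intercept, coefficients, terms):
    """Evaluate I + sum(alpha * term) for the terms that carry a coefficient."""
    return intercept + sum(alpha * terms[name] for name, alpha in coefficients.items())

# Hypothetical test: strain range 1.2 % (range 0.6 to 1.2 %), LWR environment,
# R_t = 25 micrometres (hypothetical range 0 to 50 micrometres)
factors = {
    "eps_r": normalize(1.2, 0.6, 1.2),
    "E": 1.0,
    "R_t": normalize(25.0, 0.0, 50.0),
}
terms = factorial_terms(factors)
print(sorted(terms))
# ['E', 'E*R_t', 'R_t', 'eps_r', 'eps_r*E', 'eps_r*R_t']

# Hypothetical coefficients, for illustration only
print(linear_predictor(8.0, {"eps_r": -1.0, "E": -0.5, "eps_r*E": 0.1}, terms))
```

With all factors on the same [−1;1] scale, the magnitude of each coefficient directly reflects how strongly the corresponding term shifts the predicted logarithmic life across the tested range.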
The dashed line refers to the training set. The -LogLikelihood for the training set rises continuously with the progression of the algorithm (from right to left). This is expected since reducing the number of terms in the model will necessarily lead to worse fits. The behaviour of the solid curve for the validation set is different: initially, the -LogLikelihood drops until it reaches a minimum in Step 10 (indicated by the vertical red line) and continuously rises from there. This means that the model that best describes the validation set is reached in Step 10 of the algorithm. The corresponding model coefficients are listed in Table 6 Model (a) (Appendix A).
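The group-wise division into training and validation sets can be sketched as a stratified random split. This is a minimal illustration under the assumption that the split is random within each (strain range, environment) group; the record layout, the function name and the seed are hypothetical.

```python
import random
from collections import defaultdict

def stratified_split(tests, key, train_fraction=0.75, seed=0):
    """Split tests so that each group defined by key keeps the same train fraction."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for test in tests:
        groups[key(test)].append(test)
    train, validation = [], []
    for members in groups.values():
        members = members[:]          # copy so the caller's lists stay untouched
        rng.shuffle(members)
        n_train = round(train_fraction * len(members))
        train.extend(members[:n_train])
        validation.extend(members[n_train:])
    return train, validation

# Hypothetical records: 8 tests in each of the four (strain range, environment) groups
tests = [{"eps_r": e, "env": env, "id": i}
         for e in (0.6, 1.2) for env in ("air", "LWR") for i in range(8)]
train, validation = stratified_split(tests, key=lambda t: (t["eps_r"], t["env"]))
print(len(train), len(validation))  # 24 8
```

Splitting within each group ensures that both sets cover all four combinations of strain range and environment, so the validation score is not dominated by a single region of the data.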
An alternative approach for selecting a model while avoiding overfitting is to use a measure for the quality of the fit that penalises models with a larger number of parameters. The Bayesian information criterion (BIC) is such a measure. It is defined as:
$$\mathrm{BIC} = -2 \ln L + k \ln n \qquad (11)$$
where $L$ is the likelihood, $k$ is the number of parameters in the model and $n$ the number of data points. As for the -LogLikelihood discussed above, lower values of BIC indicate a better fit. The second term of the sum in Equation (11) penalizes models with more parameters. Figure 4b shows the BIC for the different steps in the backward elimination algorithm for the full data set. The best model is again reached in Step 10; the corresponding model coefficients are listed in Table 6 Model (b).

Table 6. Coefficients for the best models in Figure 4. Note that the normalized versions of the factors need to be used (Table 5); in the case of the categorical variable $E^n$, the coefficient is zero for $E^n$ = −1 and the value in the table for $E^n$ = +1. The p-value in the last column is an indication of the statistical significance of an effect; a threshold of 0.05 is often used as the criterion for statistical significance, with lower values indicating higher significance. $\sigma$ is a parameter in the lognormal distribution (Equation (10)).
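The combination of backward elimination and BIC-based selection can be sketched as follows. This is an illustrative stand-in, not the software or the elimination rule used in the project: it assumes BIC = 2·(-LogLikelihood) + k·ln(n), removes at each step the term whose removal degrades the fit the least, and replaces the real model fit by a hypothetical function `toy_nll` in which each term lowers the -LogLikelihood by a fixed, invented amount.

```python
import math

def bic(neg_log_likelihood, k, n):
    """BIC = 2 * (-LogLikelihood) + k * ln(n); lower values are better."""
    return 2.0 * neg_log_likelihood + k * math.log(n)

def backward_elimination(terms, fit_nll, n, extra_params=2):
    """Remove one term per step and return the (terms, BIC) path.

    fit_nll(terms) must return the -LogLikelihood of the model fitted with
    those terms; extra_params counts the parameters that are always present
    (assumed here: intercept and sigma).
    """
    current = list(terms)
    path = [(tuple(current), bic(fit_nll(current), len(current) + extra_params, n))]
    while current:
        # Candidate models with exactly one term dropped; keep the best-fitting one
        candidates = [[t for t in current if t != drop] for drop in current]
        current = min(candidates, key=fit_nll)
        path.append((tuple(current), bic(fit_nll(current), len(current) + extra_params, n)))
    return path

# Hypothetical stand-in for a real fit: fixed -LogLikelihood gain per term
GAIN = {"eps_r": 60.0, "E": 40.0, "R_t": 10.0,
        "eps_r*E": 3.0, "eps_r*R_t": 0.5, "E*R_t": 0.1}

def toy_nll(terms):
    return 200.0 - sum(GAIN[t] for t in terms)

path = backward_elimination(GAIN, toy_nll, n=170)
best_terms, best_bic = min(path, key=lambda step: step[1])
print(best_terms)  # ('eps_r', 'E', 'R_t', 'eps_r*E')
```

In this toy example, a term survives only if it improves the -LogLikelihood by more than ln(n)/2, which is how the penalty term trades goodness of fit against model size.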

Discussion
Comparing the model coefficients listed in Table 6 Models (a) and (b) shows that both models include the same terms, namely the main effects $\varepsilon_r^n$, $E^n$ and $R_t^n$, as well as the two interactions $\varepsilon_r^n * E^n$ and $\varepsilon_r^n * R_t^n$. The estimates for the coefficients of all three main effects are negative, indicating their detrimental effect on fatigue life.
The estimated factor for the interaction $\varepsilon_r^n * E^n$ is positive; when both $\varepsilon_r^n$ and $E^n$ are large, i.e., for large strain ranges in the LWR environment, the interaction therefore partly compensates the negative effects of $\varepsilon_r^n$ and $E^n$: at high strain ranges, there is less environmental effect. This is consistent with the observation reported in [3].
Similarly, the positive coefficient related to the interaction term $\varepsilon_r^n * R_t^n$ reduces the negative impact of a high surface roughness at high strain ranges. This is understandable: $R_t$ affects crack initiation rather than crack growth, so one would expect $R_t$ to have a more deleterious impact in situations where fatigue life is dominated by crack initiation, i.e., at low strain ranges, which is what the models predict.
Models (a) and (b) were determined using the same algorithm (backward elimination), but with different validation methods. Published and project internal analyses with different algorithms and slightly different data sets consistently showed the main effects $\varepsilon_r$, $E$ and $R_t$ to have the largest impact [40]. In most cases, one or two two-factor interactions were found to be statistically significant, but not practically relevant, i.e., they did not have a major impact on the predicted fatigue life. The interactions that were found to be statistically significant varied with the size of the data set and with the algorithm used for the model optimization. This may indicate that the size of these effects is at the limit of what is detectable with the number of tests available in this work. This is confirmed by the optimization curves (the solid black lines) for both models in Figure 4. In both cases, the best model is found in Step 10, but the performance of the models in Step 9 or 11 is very comparable. A further reduced model, Model (c), including only the main effects, was therefore calculated (using the BIC validation); the model parameters are listed in Table 7.

Table 7. Coefficients for a reduced model including only the main effects. Note that the normalized versions of the factors need to be used (Table 5); in the case of the categorical variable $E^n$, the coefficient is 0 for $E^n$ = −1 and the value in the table for $E^n$ = +1.

Figure 5a compares the $N_{25}$ predicted by the three models to the experimentally observed values. As could be expected from Table 6, Models (a) and (b) are hardly distinguishable. Only at very high $N_{25}$ do differences become apparent. Model (c), which only includes the main effects, differs visibly from the other two models. For high fatigue lives, Model (c) systematically predicts lower $N_{25}$, whereas the contrary can be observed in the medium $N_{25}$ range around 4000 cycles. In the region where $N_{25}$ is around 1000 cycles, all three models match well in general, with Model (c) deviating from the others in some cases. These differences result from omitting the interaction effects. However, the differences between the reduced Model (c) and the optimal Models (a) and (b) are small compared to the scatter observed experimentally. Therefore, Model (c) seems to be good enough to make realistic predictions.
During the analysis, all tests were considered to be carried out either in air or in the LWR environment, where the LWR environment included simulated PWR, as well as simulated VVER conditions, and no distinction was made between the latter two. Furthermore, all tests in the VVER environment (and only these) were performed on a 321 steel. The question is whether pooling the PWR and VVER tests was a sensible approach. The VVER data in Figure 5b are distributed around the black reference line and do not show any particularities. Hence, based on the data available here, the model describes the VVER data just as well as the PWR data. Similarly, the model predictions work equally well for different strain ranges $\varepsilon_r$ (Figure 5c) and surface roughnesses $R_t$ (Figure 5d). The effect of $R_t$ on the predicted fatigue life is visible in the separation of the blue points with very low $R_t$ values from the grey/red points with higher $R_t$ values. The gap between these two groups is larger for larger fatigue lives, showing the interaction between $R_t$ and $\varepsilon_r$.

Test Conditions
In this sub-programme, a limited number of tests were carried out at conditions with a lower $F_{en}$ than in the main programme. From Equation (3), it follows that without changing the water chemistry (i.e., the DO content), $F_{en}$ can be reduced either by reducing the temperature $T$ or by increasing the (positive) strain rate $\dot{\varepsilon}$. The maximum strain rate that could be achieved in all autoclaves in the project was ten times higher than in the main programme, i.e., $\dot{\varepsilon}$ = 0.1 %/s. This leads to $F_{en}$ = 2.68; the same $F_{en}$ is obtained by reducing $T$ to 230 °C (Table 2).
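The equivalence between the two ways of lowering $F_{en}$ can be checked by inverting the exponential expression for the temperature. The sketch below assumes $F_{en} = \exp(-T^* \dot{\varepsilon}^* O^*)$ with $T^* = (T - 100)/250$, $\dot{\varepsilon}^* = \ln(\dot{\varepsilon}/7)$ ($\dot{\varepsilon}$ in %/s) and $O^* = 0.29$; the function names are hypothetical.

```python
import math

O_STAR = 0.29  # assumed O* for the water chemistries discussed here

def f_en(temp_c, strain_rate_pct_s):
    """F_en = exp(-T* * eps_rate* * O*) with the assumed parameter definitions."""
    t_star = (temp_c - 100.0) / 250.0
    return math.exp(-t_star * math.log(strain_rate_pct_s / 7.0) * O_STAR)

def temperature_for_f_en(target_f_en, strain_rate_pct_s):
    """Solve F_en(T, rate) = target for T at a fixed strain rate."""
    t_star = math.log(target_f_en) / (-math.log(strain_rate_pct_s / 7.0) * O_STAR)
    return 100.0 + 250.0 * t_star

# F_en reached by a tenfold increase of the strain rate at 300 degC
target = f_en(300.0, 0.1)
print(round(target, 2))                            # 2.68
# Temperature that gives the same F_en at the original strain rate
print(round(temperature_for_f_en(target, 0.01)))   # 230
```

The inversion returns a temperature just below 230 °C, consistent with the rounded value chosen for the reduced temperature tests.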

Data Overview
Only a limited number of tests were available for the test programme at reduced $F_{en}$. Here, only tests with a strain range $\varepsilon_r$ = 0.6% in the LWR environment are considered. Forty-nine tests were available for the analysis [41], of which 15 were at the lower $F_{en}$, with eight tests at reduced temperature $T$ and seven tests at increased positive strain rate $\dot{\varepsilon}$. Data from the main programme at the positive strain rate 0.01 %/s were used as reference data. Some of these tests were carried out with mean strain or hold times. However, since the analysis of the data in the main programme did not reveal any mean strain or hold time effects, these parameters are not considered in the analysis of the low $F_{en}$ data. The fatigue lives of the tests used in the low $F_{en}$ analysis are plotted in Figure 6 and the distributions of the most relevant test parameters in Figure 7.

Data Analysis
Because of the limited number of tests available for this sub-programme, no tests combining reduced temperature $T$ with increased strain rate $\dot{\varepsilon}$ were carried out. This gap in the test matrix is reflected in the correlation between $T$ and $\dot{\varepsilon}$ in the correlation matrix (Table 8). It should also be noted that the small number of tests led to a reduced spectrum of $R_t$ in the low $F_{en}$ data. For both groups of low $F_{en}$ data (with reduced $T$ and with increased $\dot{\varepsilon}$), the maximum $R_t$ is around 30 µm, whereas for the group at higher $F_{en}$, it is almost 50 µm.
Table 9 shows the normalization definitions for the factors in the low $F_{en}$ programme. Note that the spectrum of $R_t$ values is smaller than in the main programme, which leads to a slightly different normalization. The small number of tests available in the two low $F_{en}$ groups makes it impractical to divide the data into a training and a validation set. Therefore, only the BIC method described in Section 3.1.3 is used for the analysis of the low $F_{en}$ data. As before, the optimal model is determined by means of the backward elimination algorithm. The initial model includes all three main effects $R_t^n$, $\dot{\varepsilon}^n$ and $T^n$, as well as the two-factor interactions $R_t^n * \dot{\varepsilon}^n$ and $R_t^n * T^n$. Since no data with high $\dot{\varepsilon}$ and low $T$ are available, no information about a possible interaction between these two parameters is present in the data.
The plot with the different steps of the backward elimination algorithm is shown in Figure 8; the parameter estimates for the optimal model, which includes only the main effects, are listed in Table 10.
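The backward elimination described above can be illustrated with a minimal sketch. It is not the project's analysis code: the data are synthetic, the coefficients are invented, and the names `bic`, `backward_elimination`, `R`, `e` and `T` are hypothetical. The BIC is taken in its common ordinary least-squares form, n ln(RSS/n) + k ln(n), which may differ in detail from the criterion of Section 3.1.3, and the rule that a main effect is only removed once no interaction contains it is an assumption.

```python
import numpy as np

def bic(y, X):
    """BIC of an ordinary least-squares fit, assuming Gaussian errors."""
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    return n * np.log(rss / n) + k * np.log(n)

def backward_elimination(y, terms):
    """Drop one term at a time for as long as doing so lowers the BIC.

    `terms` maps a term name to its column; the intercept is always kept.
    Assumption: a main effect is removable only once no active
    interaction contains it.
    """
    n = len(y)

    def design(names):
        return np.column_stack([np.ones(n)] + [terms[t] for t in names])

    active = list(terms)
    best = bic(y, design(active))
    history = [(tuple(active), best)]
    while active:
        candidates = []
        for term in active:
            in_interaction = any(
                "*" in other and term in other.split("*") for other in active
            )
            if in_interaction:
                continue
            reduced = [t for t in active if t != term]
            candidates.append((bic(y, design(reduced)), reduced))
        if not candidates:
            break
        score, reduced = min(candidates, key=lambda c: c[0])
        if score >= best:  # no removal improves the BIC: stop
            break
        active, best = reduced, score
        history.append((tuple(active), best))
    return active, history

# Synthetic data in normalized units (all values hypothetical):
# 34 reference tests, 8 at reduced T, 7 at increased strain rate,
# and no test combining both changes.
rng = np.random.default_rng(0)
T = np.array([1.0] * 34 + [-1.0] * 8 + [1.0] * 7)
e = np.array([-1.0] * 34 + [-1.0] * 8 + [1.0] * 7)
R = rng.uniform(-1.0, 1.0, size=T.size)
log_N = 3.5 - 0.3 * R + 0.2 * e - 0.2 * T + rng.normal(0.0, 0.05, size=T.size)

terms = {"R": R, "e": e, "T": T, "R*e": R * e, "R*T": R * T}
selected, history = backward_elimination(log_N, terms)
for names, score in history:
    print(f"{score:8.2f}  {' + '.join(names)}")
```

Each printed row corresponds to one elimination step, similar in spirit to the steps plotted in Figure 8. Because the synthetic response contains only main effects, the interactions are expected to be dropped in most random draws, although a spurious interaction can survive by chance.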

Discussion
As would be expected, higher ε̇, as well as lower R t and T, increased the fatigue life. From the coefficients in Table 10, it is clear that increasing ε̇ from 0.0001/s to 0.001/s and reducing T from 300 °C to 230 °C had the same beneficial effect on fatigue life. In the range of parameters studied here, the effect of R t was ca. 40% stronger than that of the other two parameters.
In Table 11, the fatigue lives for polished specimens (R n t = −1) are calculated for different settings of T n and ε̇ n . The first row corresponds to the conditions in the high F en programme. In the second and third rows, the fatigue lives for the two means of reducing the F en are calculated. As expected, reducing T n and increasing ε̇ n yield virtually the same predicted N 25 . The ratio between the predicted N 25 values at low and high F en conditions is 1.5. This is reasonably close to the ratio of the high and low F en values (1.7).

Table 11. Fatigue lives calculated with the model in Table 10.
Other algorithms led to models that also included the statistically significant interactions R n t * ε̇ and R n t * T n . In these models, the coefficient for R n t * ε̇ was negative, and the coefficient for R n t * T n was positive. That would mean that in the parameter range studied here, the fatigue life of polished specimens would be more sensitive to the variations of positive strain rate and temperature than the fatigue life of specimens with a ground surface finish. However, the error bars of the more complex models overlap with the error bars of the simpler model in Table 10, so that from an application point of view, there is no practical difference between the models, and the simpler one can be used.
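The F en ratio of 1.7 quoted in this discussion can be reproduced with a short calculation. The sketch below assumes the F en expression for austenitic stainless steels as it is commonly quoted from NUREG/CR-6909, Rev. 1 [4], for low dissolved oxygen water: F en = exp(−T* ε̇* O*), with T* = (T − 100)/250 above 100 °C, ε̇* = ln(ε̇/7) for strain rates between 0.0004 %/s and 7 %/s, and O* = 0.29. The expression is written from memory and should be checked against the report before use; the function name `fen_austenitic` is hypothetical.

```python
import math

def fen_austenitic(temp_c, strain_rate_pct_per_s, o_star=0.29):
    """Environmental factor F_en for austenitic stainless steels.

    Assumed form (NUREG/CR-6909 Rev. 1, low dissolved oxygen water);
    intended for temperatures up to 325 degrees C.
    """
    # Transformed temperature: no environmental effect below 100 degrees C
    t_star = 0.0 if temp_c < 100.0 else (temp_c - 100.0) / 250.0
    # Transformed strain rate: no effect above 7 %/s, saturation below 0.0004 %/s
    if strain_rate_pct_per_s > 7.0:
        e_star = 0.0
    else:
        e_star = math.log(max(strain_rate_pct_per_s, 0.0004) / 7.0)
    return math.exp(-t_star * e_star * o_star)

# Strain rates converted to %/s: 0.0001/s = 0.01 %/s, 0.001/s = 0.1 %/s
fen_high = fen_austenitic(300.0, 0.01)      # main programme, about 4.57
fen_low_temp = fen_austenitic(230.0, 0.01)  # reduced temperature, about 2.69
fen_low_rate = fen_austenitic(300.0, 0.1)   # increased strain rate, about 2.68

print(f"F_en high:                {fen_high:.2f}")
print(f"F_en reduced temperature: {fen_low_temp:.2f}")
print(f"F_en increased rate:      {fen_low_rate:.2f}")
print(f"ratio high/low:           {fen_high / fen_low_rate:.2f}")
```

Under these assumptions, both ways of lowering F en give nearly the same value, and the ratio of the high to the low F en is about 1.7, compared with the ratio of 1.5 between the predicted N 25 values.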

Sub-Programme on Hold Times
A series of tests that included hold times was performed in Phases I and II of the INCEFA-PLUS test programme (Section 3.1). The tests on hold time effects in Phases I and II were carried out in strain control at the strain ranges 0.6% and 1.2% (Table 2). The hold periods were introduced at the position in the cycle where the mean strain (0% or 0.5%) was reached with a positive strain rate. Holds of 72 h were introduced in three cycles per test; the cycles with holds depended on the test conditions. For tests in air, holds were added at 6000 cycle intervals starting from the 6000th cycle for strain range 0.3% and 1000 cycle intervals starting from the 1000th cycle for 0.6%. Preliminary analysis including Phase II tests ([40,42], confirmed in Section 3.1.3) suggested that there was no observable effect of hold times for the tested conditions, although beneficial effects of hold time on fatigue life have been reported by the AdFaM (Advanced Fatigue Methodologies) project [13].
Therefore, hold times were removed from Phase III testing, and in parallel, a subprogramme on hold time effects was initiated where the test conditions were more closely aligned to those used in the AdFaM study [13].

Data from Hold Time Testing
According to [13], hold time effects were most prominent at low strain amplitude and with holds at zero stress at elevated temperature. To increase the chances of observing a hold time effect, the strain range in the sub-programme on hold time effects was therefore reduced to 0.4%. Furthermore, holds were performed under zero load rather than in strain control (in some cases, at non-zero mean strain) as in the main programme. Three 72 h holds at 350 °C were applied at 10,000 cycle intervals starting from the 10,000th cycle. The temperature during hold times was increased from 300 °C to 350 °C. Cycling was carried out at room temperature or at 300 °C. All tests were performed in air on the common batch XY182 of 304L [43]. Figure 9 shows the evolution of the maximum stress per cycle during the test.
Three of the seven tests (specimens "EDF AIR 2", "LEI-19" and "LEI-22") were carried out at 300 °C, whereas the other four were performed at room temperature. The tests with the specimens "EDF AIR 2" and "LEI-21" were the only ones without holds.

Discussion of Hold Time Data
The reducing effect of the increased temperature on the stress level and fatigue life is obvious from Figure 9. The hardening effect of the hold periods is visible from the peaks in the maximum stress at 10,000, 20,000 and 30,000 cycles. The curves for the tests at 300 °C showed a primary hardening followed by softening and secondary hardening before failure occurred around cycle 100,000. The maximum cyclic stresses of the three tests evolved in a very similar manner, especially given that they were tested in two different laboratories (LEI and EDF). The hold times led to hardening, but there was no long-lasting effect on either stress level or fatigue life.
The situation for the tests at room temperature was different: until the first hold at cycle 10,000, the stress curves evolved in parallel even if there were some differences in absolute stress values. The first hold (at 350 °C at zero stress) then hardened the material, similar to what was observed for the tests cycled at 300 °C. However, the stress increase was much higher and decayed more slowly when cycling restarted. Furthermore, for the remainder of the tests, the three tests with holds reached higher stress levels compared to the reference test ("LEI-21") than they had before the holds. The second and third hold times seemed to have less effect. The increased stress, however, did not seem to have an impact on fatigue life. In particular, no extension of fatigue life as reported in [13] was evident (two of the three tests with holds actually had shorter fatigue lives than the reference test without holds). The reason for that discrepancy with the AdFaM results remains unclear; it might be that the number of hold periods played a role. In the tests reported in [13], hold periods were applied throughout the test, so depending on the conditions, there were many more than just three hold periods in a test.

Conclusions
A major test programme on strain-controlled fatigue in air and LWR conditions was carried out. The main programme with F en = 4.57 investigated the effects of strain range, mean strain, hold time, surface roughness and environment on the fatigue life of austenitic steels. The test matrix was optimized by the design of experiments methodology. A linear model taking into account possible interactions was determined. No influences of hold time and mean strain were identified. The test data could be described by a model including only the main factors strain range, environment and surface roughness. The interaction effects of strain range with the environment, as well as with surface roughness, were found to be statistically significant, but of limited practical relevance.
In a sub-programme at a lower F en = 2.68, the influences of temperature, positive strain rate and surface roughness were studied. Because of the limited number of tests, not all possible interactions could be addressed. No firm evidence for an interaction of surface roughness with either temperature or strain rate was detected. In the parameter range investigated, the effect of surface roughness was slightly larger than the effects of temperature and strain rate. As predicted by NUREG/CR-6909, reducing the temperature from 300 °C to 230 °C in the LWR environment had the same effect on fatigue life, relative to the conditions with F en = 4.57, as increasing the positive strain rate from 0.0001/s to 0.001/s. Finally, a limited number of tests in air with holds at elevated temperature under zero-stress conditions did not find evidence for beneficial effects of hold times on fatigue life like those found in another study. The reason might be the difference in the number of hold time periods.

Informed Consent Statement: Not applicable.
Data Availability Statement: Requests to access the data presented in this study can be submitted to the data owner(s). The data are stored in a database and traceable through DOIs [32,41,43] but are not publicly available due to the data policy of the project.

Conflicts of Interest: The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; nor in the decision to publish the results.

Abbreviations
The following abbreviations and symbols are used in this manuscript: