Round Robin into Best Practices for the Determination of Indentation Size Effects

The paper presents a statistical study of nanoindentation results obtained in seven European laboratories that have joined a round robin exercise to assess methods for the evaluation of indentation size effects. The study focuses on the characterization of ferritic/martensitic steels T91 and Eurofer97, envisaged as structural materials for nuclear fission and fusion applications, respectively. Depth-controlled single cycle measurements at various final indentation depths, force-controlled single cycle and force-controlled progressive multi-cycle measurements using Berkovich indenters at room temperature have been combined to calculate the indentation hardness and the elastic modulus as a function of depth applying the Oliver and Pharr method. Intra- and inter-laboratory variabilities have been evaluated. Elastic modulus corrections have been applied to the hardness data to compensate for materials related systematic errors, like pile-up behaviour, which is not accounted for by the Oliver and Pharr theory, and other sources of instrumental or methodological bias. The correction modifies the statistical hardness profiles and allows determining more reliable indentation size effects.


Introduction
Nanoindentation is extensively used to provide information about the mechanical behaviour of materials at the nanoscale through the evaluation of force versus displacement curves measured during instrumented indentation. The most frequently used technique to calculate the indentation hardness and moduli of materials is based on Oliver and Pharr's method, which determines the contact depth by accounting for the curvature of the unloading segment of the force-displacement data as described by a power law [1,2]. More recently, dynamic measurements in continuous stiffness measurement (CSM) mode have appeared, in which a small sinusoidal oscillation is superimposed on the quasi-static load cycle and the contact stiffness is measured continuously during loading. However, it remains challenging to evaluate to what extent the CSM measurement mode itself affects the results of the test. Indeed, a dependence of the derived mechanical properties on the oscillation parameters has been reported [3], as well as mismatches between static and dynamic indentation hardness due to strain rate sensitivities [4] and an influence of instrumental artefacts on the measured stiffness [5]. Beyond the measurement of mechanical properties at small scale, nanoindentation can also be used as an experimental tool for studying fundamental materials physics such as the formation of dislocation networks, plastic instabilities, and phase transformations [6]. Continuous refinement of experimental and modelling methodologies to assess mechanical properties at the local level is establishing nanoindentation as a technique for obtaining microstructural information [7][8][9][10][11], notably through the exploitation of indentation size effects (ISE) [12][13][14][15], which represent a link between small-scale mechanical and microstructural properties. However, the usefulness of the extracted information depends on the reliability of nanoindentation measurements.
Common uncertainties in nanoindentation measurements cause bias and scatter of the measured values. They originate from the key calibrations of the instrument (displacement, force, indenter area function, frame compliance), from the zero-point determination that establishes the initial depth of penetration, and from random noise contributions from the environment, such as ground and acoustic vibrations, which cause variation in the force and displacement measurements. The results obtained are also affected by the models used for the evaluation of data and by the data analysis corrections for variation in the exponent of the power law fit to the data and for lateral dilation of the indentation contact [16]. Specimen roughness can affect both the zero point and the actual area of contact, and residual stress (which may be intrinsic to the sample manufacturing route or polishing-induced) causes an error in the contact mechanics estimate of the actual contact area. Joslin and Oliver [17] have presented a method to remove the errors due to surface roughness by analyzing the composite parameter hardness divided by modulus squared ($H/E^2$) instead of treating hardness and modulus separately. Other potential errors include surface forces/adhesion and material exhibiting pile-up or sink-in behaviour, which can vary across the same sample, depending on the ratio of local yield stress (and so hardness) to elastic modulus and on the orientation of the indenter geometry relative to the local crystal orientation. Awareness of the possible influences and errors in nanoindentation measurements is critical for elaborating practices and methodologies that eliminate them, reduce them or take them into account. A more detailed consideration of the estimation of uncertainties in instrumented indentation can be found in [18] and in ISO 14577:2015-Annex H.
Once valid indentation data have been obtained, there remains the issue of indentation size effects, where smaller indentations are harder because the yield stress of materials that deform via dislocation generation and movement has a fundamental length-scale dependence [7,8,19,20].
T91 and Eurofer 97 are tempered martensitic steels which are candidate materials for structural components in nuclear fission and fusion reactors. To predict the long-term material behaviour at operating conditions in nuclear environments, their deformation behaviour under high irradiation dose levels must be characterized. Nanoindentation and other more recent micromechanical testing approaches have proved promising to assess radiation damage, either caused by ions or by neutrons, thanks to the possibility of testing shallow depths affected by ion irradiation and/or small volumes of activated materials after neutron irradiation [21][22][23]. Due to the limited availability of irradiated samples, developing methodologies for robust characterizations as performed in different laboratories Nanomaterials 2020, 10, 130 3 of 15 and by different instruments is advantageous. The methodology should take into account pile-up formation, since the ferritic/martensitic steels exhibit high dislocation density and therefore, significant resistance to dislocation motion and low strain hardening capability, which may force the material upwards during the indentation process [16]. The Oliver and Pharr procedure [1] will then produce inaccurate results because it does not take modifications of the contact area due to pile-up behaviour into account [24]. In line with the original work by Joslin and Oliver [17], it has been recently demonstrated that uncertainties related to the contact area determination (pile-up and residual stresses) can be compensated during data analysis if the elastic properties of the material are a priori known and a correction factor can be applied to the hardness values [23,25,26].
In this framework, seven laboratories have engaged in a round robin testing campaign to probe T91 and Eurofer97 surfaces prepared by the same sample preparation method and using the same measurement protocols in various quasi-static nanoindentation testing modes. This work presents an analysis of the nanoindentation data obtained by the different devices, accounting for uncertainties in the contact area, in order to define a best practice methodology for the determination of indentation size effects.

Materials
Two ferritic/martensitic steels, namely T91 and Eurofer97, were used for this study. The chemical composition of the steels is given in Table 1. T91 specimens were cut from a hot rolled plate normalized at 1050 °C for 1 min/mm (per mm thickness), quenched to room temperature, tempered at 770 °C for 3 min/mm and then cooled in air. Eurofer97 samples were cut from broken Charpy specimens prepared from forged bars hardened at 979 °C for 1 h 51 min and tempered at 739 °C for 3 h 42 min. The materials and their tempering treatments were chosen because they feature a nanoscopic martensite lath structure with characteristic lath sizes in the range of 100 to 200 nm [13], which is fine enough for nanoindentation to probe an effective medium, thereby minimizing the effect of grain structure and crystallographic orientation on the measured hardness. The materials were cut into plates of 1 mm thickness and polished with successively finer abrasives and polishing solutions (diamond suspensions of 9 µm, 3 µm and 1 µm particles), finishing with a gentle manual polish in an oxide polishing suspension with silica nanoparticles for 5 min. All samples were polished in one laboratory and distributed to the laboratories participating in the study. The roughness of the surface was checked by atomic force microscopy to be below 20 nm (Figure 1). The surface residual stresses were checked in one sample of each material using X-ray diffractometry. The values in both materials were close to 425 MPa compressive stress.

Nanoindentation Tests
Indentation tests were conducted in a variety of nanoindentation test devices from several providers (Anton Paar, Corcelles, Switzerland; Bruker, Santa Barbara, CA, USA; Micro Materials Ltd., Wrexham, UK; MTS, Eden Prairie, MN, USA; Keysight (formerly Agilent), Santa Rosa, CA, USA; Zwick-Roell, Ulm, Germany). The thermal drift at room temperature was below 0.05 nm/s for the different nanoindentation systems used in this study. The maximum load of the devices varied from 10 mN to 10 N, the load resolution from 1 nN to 100 nN and the electronic displacement resolution from 0.3 pm to 50 pm. However, no distinction between the instruments' capabilities is made in the aggregation of data and the statistical analyses that follow.
All tests were conducted at room temperature using Berkovich diamond tips. The tip area function and the instrument frame compliance were calibrated according to ISO 14577-2:2015. Three quasi-static nanoindentation measurement modes have been applied for the comparison of ISE, namely force-controlled single cycles (FSC), depth-controlled single cycles (DSC) and progressive multi-cycles in force control (PMC). FSC measurements were performed at five maximum forces, $F_{\max}$, equal to 1 mN, 5 mN, 10 mN, 50 mN and 100 mN, using loading and unloading ramp times of 30 s and a 10 s dwell at $F_{\max}$. The instruments with a maximum load below 100 mN performed the measurements at the $F_{\max}$ values divided by 10. DSC measurements were performed up to maximum depths, $h_{\max}$, ranging from 50 nm to 500 nm, using loading and unloading ramp times of 30 s and a 10 s dwell at $h_{\max}$. PMC measurements were performed by applying 10 consecutive loading-unloading cycles, with the force being increased by $0.1\,F_{\max}$ in each cycle from 10 mN to 100 mN (or from 1 mN to 10 mN for the systems reaching maximum loads below 100 mN). In every cycle the loading, unloading and dwelling times were set to 10 s and the unloading was conducted down to 30% of the maximum force of the cycle. For drift correction, a final 60 s dwell at 10% of $F_{\max}$ or $h_{\max}$ was applied to all measurements prior to complete unloading. At least 15 measurements were taken and averaged for each $F_{\max}$ in FSC and for each $h_{\max}$ in DSC. Likewise, 15 PMC measurements of 10 cycles each were taken and the results of the 15 measurements were averaged cycle by cycle.
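The PMC force schedule described above can be written out explicitly. The following is a minimal sketch only: the function name `pmc_schedule` and its parameters are hypothetical, and the dwell and drift-correction segments are deliberately left out.

```python
def pmc_schedule(f_max_mN: float, n_cycles: int = 10, unload_fraction: float = 0.3):
    """Return (peak force, unloading floor) pairs in mN for a progressive multi-cycle test."""
    schedule = []
    for cycle in range(1, n_cycles + 1):
        # The peak force rises by F_max / n_cycles in every cycle (0.1 * F_max for 10 cycles).
        peak = f_max_mN * cycle / n_cycles
        # Each cycle is only partially unloaded, down to 30% of that cycle's peak force.
        floor = unload_fraction * peak
        schedule.append((peak, floor))
    return schedule


if __name__ == "__main__":
    for peak, floor in pmc_schedule(100.0):
        print(f"peak {peak:6.1f} mN, unload to {floor:5.1f} mN")
```

Calling `pmc_schedule(10.0)` gives the reduced schedule used by instruments whose maximum load is below 100 mN.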

Data Analysis
The data were analysed according to ISO 14577-1:2015 using the Oliver and Pharr methodology [1] after application of zero-point and thermal drift corrections. The contact depth $h_c$, the indentation hardness $H_{IT}$ and the reduced plane strain modulus of the contact $E_r$ are determined by Equations (1)-(3):

$$h_c = h_{\max} - \varepsilon\,(h_{\max} - h_r) \qquad (1)$$

$$H_{IT} = \frac{F_{\max}}{A_p(h_c)} \qquad (2)$$

$$E_r = \frac{\sqrt{\pi}}{2\beta}\,\frac{S}{\sqrt{A_p(h_c)}} \qquad (3)$$

where $h_{\max}$ is the depth at maximum force, $h_r$ is the tangent depth, $\varepsilon$ is a correction factor dependent on the indenter geometry and the extent of plastic yield in the contact ($0.6 < \varepsilon < 0.8$), $A_p$ is the projected area of contact, $S$ is the stiffness, and $\beta$ is a geometric factor set to 1.034 for a Berkovich indenter. $\beta$ is introduced to correct the analysis equations, which are based on the geometry of an axisymmetric cone, for the shape of a Berkovich indenter [27].
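As an illustration of how Equations (1)-(3) are evaluated for a single indent, the sketch below may help. It rests on several assumptions that are not taken from the study: an ideal Berkovich area function $A_p = 24.5\,h_c^2$ in place of a calibrated one, a fixed $\varepsilon = 0.75$, the tangent depth obtained as $h_r = h_{\max} - F_{\max}/S$, and hypothetical function names and input values.

```python
import math

BETA_BERKOVICH = 1.034  # geometric factor quoted in the text


def ideal_berkovich_area(h_c_nm: float) -> float:
    """Projected area in nm^2 of an ideal Berkovich tip (assumption: A_p = 24.5 * h_c^2)."""
    return 24.5 * h_c_nm ** 2


def oliver_pharr(f_max_mN, h_max_nm, stiffness_mN_per_nm,
                 epsilon=0.75, area_fn=ideal_berkovich_area, beta=BETA_BERKOVICH):
    """Return (h_c in nm, H_IT in GPa, E_r in GPa) for one unloading curve."""
    # Tangent depth: intercept of the unloading tangent at F_max with the depth axis.
    h_r = h_max_nm - f_max_mN / stiffness_mN_per_nm
    h_c = h_max_nm - epsilon * (h_max_nm - h_r)        # Equation (1)
    a_p = area_fn(h_c)
    # Unit conversion: 1 mN/nm^2 = 1e6 GPa.
    hardness = f_max_mN / a_p * 1e6                    # Equation (2)
    e_r = (math.sqrt(math.pi) / (2.0 * beta)
           * stiffness_mN_per_nm / math.sqrt(a_p) * 1e6)  # Equation (3)
    return h_c, hardness, e_r


if __name__ == "__main__":
    # Hypothetical input values chosen only to be of a plausible magnitude for steel.
    h_c, h_it, e_r = oliver_pharr(f_max_mN=10.0, h_max_nm=390.0, stiffness_mN_per_nm=0.4)
    print(f"h_c = {h_c:.2f} nm, H_IT = {h_it:.2f} GPa, E_r = {e_r:.1f} GPa")
```

Passing a calibrated area function through `area_fn` would be the natural replacement for the ideal-tip assumption.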

Hardness and Modulus Profiles
All samples were prepared by one laboratory and then tested at different locations, using methods agreed in advance, in controlled environments with stable temperature and humidity. Therefore, the test results are not expected to depend primarily on the laboratory environment, the ageing of the materials, surface oxide formation or any other time-dependent response, assuming a proper correction for thermal drift has been made. All measurements have an inherent variability due to the random uncertainties of the test method. Between different laboratories and instruments, additional offsets are possible. Typical sources of variability and offsets between different laboratories are:
• calibration differences (force, displacement, the calibration of the indenter area function and the correction of the frame compliance) and other measurement uncertainties,
• sample-to-sample property variations (compositional variation, polishing differences, residual stress, etc.),
• differences in the details of the analysis methods applied in the software of each instrument: Oliver and Pharr use a β factor of 1.034, whereas ISO 14577:2002 uses a factor of 1; software compliant with ISO 14577:2015 applies a variable ε (i.e., determines a correction factor for ε which depends on the exponent of the power law fitting the unloading curve) and a lateral dilation correction to the contact area calculation (which depends on the hardness to elastic modulus ratio of the test piece) [28].
Figure 2 shows the hardness and reduced modulus of T91 and Eurofer97 as a function of contact depth measured in the different laboratories using FSC, DSC and PMC nanoindentations, while Figure 3 shows the hardness profiles combined for all measurement methods and laboratories. The datasets used to plot Figures 2 and 3 are provided as Supplementary Materials (see Spreadsheet S1).
The hardness profiles (Figure 2) exhibited comparatively low scatter for FSC and PMC methods, whereas a large scatter was observed in the modulus values. The calculation of hardness depends on the calibration of the projected area and the determination of the contact depth. In addition to these factors, the calculation of the modulus depends also on the determination of stiffness. The prevalence of scatter in the modulus values indicates that there is a higher uncertainty in the determination of stiffness by the different testing devices than in determining contact depth. The extreme outlier observed in Figure 3 (Lab 7, yellow data) is probably due to an inaccurate instrument calibration, in particular a largely underestimated frame compliance.

Elastic Modulus Correction (EMC)
T91 and Eurofer97 are relatively high-strength and low-strain hardening materials, with a moderate modulus to hardness ratio ($E/H$) of the order of 60. Because of these properties, one may suspect the formation of pile-ups during indentation. Indeed, the values of $h_p/h_{\max}$ are about 0.9, well beyond the 0.7 threshold above which pile-up occurs [2,19]. Effects of pile-up are mainly reflected in a systematic error in the determination of the projected contact area and, thus, an overestimation of hardness and modulus. To account for these effects, an elastic modulus correction [25,26] has been applied to the hardness values whereby hardness is corrected by a factor depending on the ratio of the measured reduced modulus to a reference reduced modulus value, assuming that the elastic modulus is independent of depth. The reference value for the reduced modulus, $E_r^{\mathrm{ref}}$, has been calculated by Equation (4):

$$\frac{1}{E_r^{\mathrm{ref}}} = \frac{1-\nu_s^2}{E_s^{\mathrm{ref}}} + \frac{1-\nu_i^2}{E_i} \qquad (4)$$

where $\nu_s$ is the Poisson ratio of the steel samples set to 0.3, $\nu_i$ is the Poisson ratio of the diamond indenter set to 0.07, $E_i$ is the indenter modulus set to 1141 GPa, and $E_s^{\mathrm{ref}}$ is the macroscopic elastic modulus of the steel samples, in this study set to 208 GPa for T91 and 217 GPa for Eurofer97. Applying Equation (4), the reference reduced moduli for T91 and Eurofer97 are 190.6 GPa and 197.4 GPa, respectively.
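The reference reduced moduli quoted above can be checked with a few lines of code. This is a small sketch in which the function name and argument names are hypothetical; the default values are the Poisson ratios and indenter modulus stated in the text.

```python
def reference_reduced_modulus(e_sample_GPa: float, nu_sample: float = 0.3,
                              e_indenter_GPa: float = 1141.0,
                              nu_indenter: float = 0.07) -> float:
    """Reduced modulus in GPa from the series combination of sample and indenter compliances."""
    compliance = ((1.0 - nu_sample ** 2) / e_sample_GPa
                  + (1.0 - nu_indenter ** 2) / e_indenter_GPa)
    return 1.0 / compliance


if __name__ == "__main__":
    print(round(reference_reduced_modulus(208.0), 1))  # T91       -> 190.6
    print(round(reference_reduced_modulus(217.0), 1))  # Eurofer97 -> 197.4
```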
We note that EMC is equivalent to analysing $H_{IT}/E_r^2$ or, according to Equations (2) and (3), $F_{\max}/S^2$, which was applied by Joslin and Oliver in order to remove the errors due to surface roughness [17]. As these authors noted, $H_{IT}/E_r^2$ is a better indication of the material's resistance to permanent indentation than hardness or modulus alone. Figure 4 shows the corrected hardness profiles of T91 and Eurofer97 measured in the different laboratories using FSC, DSC and PMC nanoindentation, while Figure 5 shows the corrected hardness profiles combined for all measurement methods and laboratories. EMC provides a correction of individual $H_{IT}$ values. Comparing Figures 3 and 5, it is evident that EMC increases the scatter in the results of each laboratory, while it decreases inter-laboratory sources of scatter and helps to identify outliers (e.g., Lab 4, Lab 5 and Lab 7 in Figure 4a, or Lab 5 and Lab 6 in Figure 4b). The latter improves data quality and comparability because it can effectively correct for calibration offsets between laboratories: the extreme outlier observed in the raw indentation hardness profiles (yellow data, Lab 7 in Figure 3) lies much closer to the curves obtained by the other laboratories after EMC is applied (Figure 5). EMC also reduces the apparent indentation size effect by correcting the overestimation of hardness due to pile-up.
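The explicit correction formula is not reproduced in this text, so the sketch below assumes the form implied by the equivalence with $H_{IT}/E_r^2$, namely a corrected hardness equal to $H_{IT}\,(E_r^{\mathrm{ref}}/E_r)^2$. Under that assumption the contact area cancels and the corrected hardness depends only on $F_{\max}$ and $S$. All function names are hypothetical.

```python
import math


def emc_hardness(h_it_GPa: float, e_r_GPa: float, e_r_ref_GPa: float) -> float:
    """Assumed EMC form: rescale hardness by (E_r_ref / E_r)^2."""
    return h_it_GPa * (e_r_ref_GPa / e_r_GPa) ** 2


def emc_hardness_from_stiffness(f_max_mN: float, stiffness_mN_per_nm: float,
                                e_r_ref_GPa: float, beta: float = 1.034) -> float:
    """Same quantity written with F_max / S^2 only; 1e-6 converts mN and nm to GPa."""
    ratio = 4.0 * beta ** 2 * f_max_mN / (math.pi * stiffness_mN_per_nm ** 2) * 1e-6
    return e_r_ref_GPa ** 2 * ratio


if __name__ == "__main__":
    # A measured modulus above the reference value lowers the corrected hardness,
    # as expected when pile-up makes the contact area look too small.
    print(emc_hardness(3.0, 210.0, 190.6))
```

Because the area drops out, errors in the area function or from pile-up do not enter the second form, whereas any error in the stiffness enters squared.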

Cross-Correlation Analysis before and after EMC
A statistical study has been carried out to investigate which combination of measurement method and analytical correction works better to minimise intra- and inter-laboratory variations. A number of mathematical functions have been proposed in the literature to describe ISE. Their basis has ranged from empirical (Hall-Petch relation [29,30]) through to physically based arguments such as strain gradient plasticity (Nix-Gao model [31]) and slip distance theory (Hou-Jennett model [7]). One simple function is an exponential function. For the current statistical study, the raw data and the elastic modulus-corrected hardness profiles have been fitted to exponential functions (Equation (6)), serving as a reference for analysing standard deviations in depth-dependent hardness and evaluating the quality of data. The choice of exponential functions is motivated by the fact that this class of functions offers sufficient flexibility to describe the decay of hardness with increasing depth while requiring only a minimum number of (three) fit parameters. Hence, the choice is a matter of practicality for quantitatively analysing the combined inter- and intra-laboratory scatter in terms of the goodness of fits, not a choice made on physical grounds in order to determine an ISE accurately.
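A three-parameter exponential fit of this kind can be sketched as follows. The functional form $H(h) = H_\infty + A\,e^{-h/h_0}$ is an assumption made for illustration, since Equation (6) is not reproduced here, and the reduced chi-squared is computed without weights (unit uncertainties). The function names are hypothetical. For a fixed $h_0$ the model is linear in $H_\infty$ and $A$, so the sketch scans $h_0$ on a progressively refined logarithmic grid and solves the linear part in closed form.

```python
import math


def _linear_fit(x, y):
    """Ordinary least squares for y = intercept + slope * x; returns (intercept, slope, sse)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return intercept, slope, sse


def fit_exponential_ise(depths_nm, hardness_GPa, h0_min=10.0, h0_max=5000.0,
                        rounds=6, grid=50):
    """Fit H(h) = H_inf + A * exp(-h / h0); returns (H_inf, A, h0, reduced chi-squared)."""
    best = None
    lo, hi = math.log(h0_min), math.log(h0_max)
    for _ in range(rounds):
        step = (hi - lo) / (grid - 1)
        for k in range(grid):
            h0 = math.exp(lo + k * step)
            x = [math.exp(-h / h0) for h in depths_nm]
            h_inf, amp, sse = _linear_fit(x, hardness_GPa)
            if best is None or sse < best[3]:
                best = (h_inf, amp, h0, sse)
        # Narrow the search window to one grid step either side of the current best h0.
        centre = math.log(best[2])
        lo, hi = centre - step, centre + step
    h_inf, amp, h0, sse = best
    dof = len(depths_nm) - 3  # three fit parameters
    reduced_chi2 = sse / dof if dof > 0 else float("nan")
    return h_inf, amp, h0, reduced_chi2


if __name__ == "__main__":
    # Synthetic, noise-free profile used only to exercise the routine.
    depths = [50.0 * i for i in range(1, 11)]
    profile = [2.5 + 3.0 * math.exp(-h / 150.0) for h in depths]
    print(fit_exponential_ise(depths, profile))
```

A general nonlinear least-squares routine would do the same job; the grid search is used here only to keep the sketch free of external dependencies.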
The EMC hardness plots (Figures 4 and 5) allow outliers to be identified and excluded from the characterization. The corrected hardness with exclusions has also been fitted to exponential curves and statistically analysed in terms of the deviation between measured and expected values (goodness-of-fit) as well as of degree of correlation with the measured elastic properties (cross-correlations of hardness and reduced modulus). Figure 6 shows the exponential fits to the raw hardness data, the EMC corrected hardness and the EMC corrected hardness with exclusions of measurements in FSC mode (Figure 6a-c) and measured by all methods (Figure 6d-f) for T91. The same analysis has been done for the sets of data from DSC and PMC measurement modes as well as for Eurofer97 (plots not shown).
The goodness of the fits has been evaluated by the standard error of the regression (reduced chi-squared, $\chi^2$) and used to assess the intra-laboratory data scatter. Inter-laboratory deviations have been assessed by the standard deviation, $\sigma_R$, of the cross-correlation functions of hardness and reduced modulus given by Equations (7) and (8) for raw hardness and EMC corrected hardness, respectively.

Figure 6. Exponential fits to raw hardness data, EMC corrected hardness and EMC corrected hardness with exclusions of T91 measured in FSC mode (a-c) and by all methods combined (d-f).

Table 2 lists the $\chi^2$ values representing the goodness of the fits and the standard deviation of the hardness vs. modulus cross-correlations for the FSC, DSC, PMC and all-methods data sets obtained for T91 and Eurofer97. In the case of the auto-correlations, in general the deviation decreases when EMC is applied, except for the PMC method, which already presented a very small $\chi^2$ in the raw data ($\chi^2 = 0.05$). Applying the exclusions, the $\chi^2$ values decrease drastically in all cases, the lowest values being achieved by using the PMC method, both in T91 and in Eurofer97. Regarding cross-correlations, the EMC hardness presented more discrepancy when correlated with the reduced modulus (higher standard deviation of the hardness-reduced modulus cross-correlation function), while the discrepancy decreases significantly when exclusions are applied. Again, the PMC method presented the highest cross-correlation.

Table 2. Standard error of the exponential fit regressions to raw hardness data and EMC corrected hardness, $\chi^2$, and standard deviation of cross-correlations between hardness and elastic modulus, $\sigma_R$, for T91 and Eurofer97 obtained from indentations using different control measurement modes.

Uncertainty Analysis and Effects of Elastic Modulus Correction
The accuracy achieved in deriving mechanical properties from nanoindentation measurements is affected by both random errors and systematic bias occurring either during the testing procedure or in the data analysis phase. Random errors, e.g., indenting on a pit in the sample surface, are difficult to correct but can often be reduced by averaging many results; systematic error can appear as a measurement bias, which can be estimated and corrected for. The origins of systematic errors include laboratory-specific errors, such as inaccurate calibrations of force, displacement, frame compliance and indenter tip shape, and material/sample-specific factors, such as surface and bulk residual stresses and pile-up or sink-in behaviour during indentation. The different sources of systematic error produce different effects in the measured indentation hardness and modulus. A blunt indenter tip that is calibrated correctly will produce the same modulus as a sharp tip but a different hardness. Compressive residual stresses (e.g., due to mechanical polishing) and pile-up behaviour produce an apparent increase in both hardness and modulus, whereas too small a frame compliance correction reduces modulus and hardness. A high surface roughness would tend to cause random uncertainty. It can cause a variable offset of the zero point from the average surface position of the sample and would cause a variation in the actual area of contact with the indenter. Depending on its lateral surface wavelength, roughness may cause an increase or a decrease in the measured stiffness. When the indent size is smaller than the roughness wavelength, indentation takes place on hills or in valleys, causing an offset in the measured contact depth and/or stiffness due to the local curvature of the sample surface. When the indent size is much greater than the roughness wavelength, the contact senses a less dense, low-modulus surface layer of asperities before the onset of contact with fully dense material.
Roughness, therefore, produces a complex series of conflicting effects on the hardness and modulus parameters extracted from a nanoindentation test, which is why ISO 14577 restricts indentation to surfaces whose average roughness is less than 5% of the contact depth.
The elastic modulus correction applied in this study relies upon the assumption that there is no error in the measured stiffness. If this assumption holds, EMC reduces the error that compressive residual stresses and pile-up behaviour cause in the estimate of the area of contact. However, if an imprecise frame compliance correction has been applied, the error lies in the measured stiffness value and not in the area of contact; in this case the specific correction formula used will introduce a compensating error rather than a correction. The use of the elastic modulus of a reference material to obtain the value of the frame compliance of a nanoindenter is a standard procedure (see ISO 14577-2:2015), but it relies upon the opposite assumption, i.e., that the area of contact obeys the assumptions of the contact mechanics analysis being applied to calculate the contact stiffness and that the error is in the contact stiffness alone. While the raw hardness profiles of individual datasets (Figures 2 and 3) are consistent with a monotonic decrease of hardness with depth (ISE), there is a large variability amongst the different laboratories regarding hardness (ISE) and modulus profiles. In stiff indents, such as at high force or in material with a high modulus-to-hardness ratio ($E/\sqrt{H}$), the force removal curve is very steep and small errors in frame compliance, force or displacement (e.g., due to vibrations, drift or creep) can cause large changes in the measured stiffness and result in a high uncertainty in $E$. However, the estimate of contact depth is little affected and so the uncertainty in $H$ is low. The result is a large measurement variability in $E$ and a low measurement variability in $H$. In this case, EMC generates a correction factor that compensates for and nulls out the random uncertainty in the modulus results. When this factor is applied to the hardness values, it significantly increases the scatter of data in the hardness profiles (Figures 4 and 5).
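The growth in scatter can be made explicit with a first-order error propagation. This is a sketch that assumes the corrected hardness, written here as $H_{EMC}$ (a symbol introduced only for this illustration), is proportional to $H_{IT}/E_r^2$, and that the errors in $F_{\max}$ and $S$ are small and independent. From Equations (2) and (3) the contact area cancels:

$$\frac{H_{IT}}{E_r^2} = \frac{F_{\max}}{A_p}\cdot\frac{4\beta^2 A_p}{\pi S^2} = \frac{4\beta^2}{\pi}\,\frac{F_{\max}}{S^2}$$

so that

$$\left(\frac{\delta H_{EMC}}{H_{EMC}}\right)^2 \approx \left(\frac{\delta F_{\max}}{F_{\max}}\right)^2 + 4\left(\frac{\delta S}{S}\right)^2$$

Under these assumptions a relative random error in stiffness appears doubled in the corrected hardness; for example, a 5% error in $S$ alone would contribute roughly 10% to $H_{EMC}$.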
This increase in random uncertainty can, however, be reduced or avoided by using averaged results. The standard error of an averaged stiffness measurement, even with large random uncertainty, is rapidly reduced by averaging a greater number of measurement results, which is easily achieved with the PMC method. Even though the intra-laboratory scatter of hardness results increased after EMC, in this case the approach was necessary to account for the systematic offsets in the indentation results caused by pile-up and residual stresses that exist in both materials. Furthermore, changes of pile-up behaviour caused by irradiation, if not corrected for, can lead to ambiguous results when using nanoindentation to study irradiation-induced hardening of materials of nuclear interest. Reduction of offsets between data sets through EMC normalises the data into a single statistical population that can be used as a basis to identify outliers, the exclusion of which largely reduces the overall variability, as revealed by the example statistical analysis performed (Figure 6 and Table 2) and discussed below.
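The benefit of averaging can be illustrated with the standard error of the mean, which falls as $1/\sqrt{n}$ for $n$ independent repeats. The snippet below is a generic sketch using hypothetical stiffness values; it is not the statistical procedure used in the study.

```python
import math
import statistics


def standard_error(values):
    """Standard error of the mean: sample standard deviation divided by sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))


if __name__ == "__main__":
    # Hypothetical repeated stiffness readings in mN/nm at one force level.
    stiffness = [0.39, 0.42, 0.40, 0.38, 0.41, 0.43, 0.40, 0.39,
                 0.41, 0.40, 0.42, 0.38, 0.40, 0.41, 0.39]
    print(f"scatter of single readings: {statistics.stdev(stiffness):.4f} mN/nm")
    print(f"standard error of the mean: {standard_error(stiffness):.4f} mN/nm")
```

With 15 repeats, as used per force or depth level in this study, the standard error is smaller than the single-measurement scatter by a factor of about 3.9.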

Statistical Analysis of Measurement Methods for Improved ISE Determination
The nanoindentation responses of the two ferritic/martensitic steels resemble each other and so do the outcomes of the statistical analysis. In both cases, PMC outperformed the two other testing methods in terms of goodness of fits, to the extent that the regressions of the raw PMC datasets are already better than those of the corrected FSC and DSC datasets. Likewise, the residual cross-correlations provide evidence that PMC performs much better than the single-cycle methods. This is likely because in PMC the depth dependence of hardness and modulus is measured at the same point on the surface of the specimen, while FSC and DSC probe different points of the surface to measure the hardness and modulus profiles. Thus, random errors introduced by point-to-point sample variability (such as surface inhomogeneity) will affect the single-cycle measurements more, and this may be reflected in the larger scatter and poorer cross-correlation of the single-cycle modes. In addition, the PMC results for different depths rely on the same zero-point determination, which reduces the scatter arising from uncertainties in the zero-point correction. Also contributing is the fact that more data points were obtained when using PMC (data were obtained at 10 depths because 10 cycles were applied, providing 10 averaged data points in the hardness and modulus profiles) as compared to FSC (5 data points in the hardness and modulus profiles) and DSC (7 data points). The standard error in a fit is reduced as the number of fitted data points increases.
For both materials, the EMC hardness compilations from which outliers have been removed exhibit significantly improved goodness of the exponential fits to the depth-dependent hardness data, as well as improved residual cross-correlations between hardness and modulus. Therefore, EMC offers a strong approach towards obtaining reliable hardness profiles to study and exploit ISE, in particular for materials prone to pile-up or sink-in behaviour during the indentation process.

Conclusions
Nanoindentation has been widely used for qualitative purposes, e.g., comparative screening of materials. For quantitative hardness and elastic modulus determination, special attention has to be paid to calibrating force, displacement, frame compliance and indenter tip area with sufficient precision. In particular, calibration of the frame compliance and the tip area function is critical for the present inter-laboratory comparison. While systematic errors associated with the correction of frame compliance are still to be considered, systematic errors originating from the projected contact area determination (tip area calibration, pile-ups and residual stresses) have been taken into consideration and significantly reduced by the application of an Elastic Modulus Correction, as evidenced by the statistical examination of hardness profiles, which shows improved goodness of fit and hardness-to-modulus cross-correlation when EMC is applied and used to identify outliers. The methodology provides a robust framework for the study of size-dependent mechanisms of deformation based on nanoindentation testing.