Sample Sizes Based on Weibull Distribution and Normal Distribution for FRP Tensile Coupon Test

Current guidelines stipulate a sample size of five for the tensile coupon test of fiber reinforced polymer (FRP) composites, based on the assumption of a normal distribution and a sample coefficient of variation (COV) of 0.058. A growing number of studies have shown that a Weibull distribution is more appropriate for characterizing the tensile properties of FRP. However, few efforts have been devoted to sample size evaluation based on a Weibull distribution, and it is not clear whether the Weibull distribution results in a more conservative sample size. In addition, the COV of FRP properties can vary from 5% to 15% in practice. In this study, the sample size based on a two-parameter Weibull distribution is compared with that based on a normal distribution. The Weibull distribution is found to give almost the same sample size as the normal distribution, which means that the sample size based on a normal distribution is applicable. For coupons with COVs varying from 0.05 to 0.20, the sample sizes range from fewer than 10 to more than 60. The use of only five coupons leads to a prediction error in the material property of between 6.2% and 24.8% for COVs varying from 0.05 to 0.20.


Introduction
Fiber reinforced polymer (FRP) composites have been increasingly used in the strengthening of engineering structures due to their high tensile strength, excellent corrosion resistance, light weight, and flexibility in shape [1][2][3][4][5][6][7]. Studies have validated that the application of FRP composites can improve the flexural capacity [8,9], stiffness [10], fatigue performance [11][12][13], and corrosion resistance [14] of structures. FRPs are also very attractive for shear strengthening and confinement [15,16]. In order to accurately evaluate the strengthening effects of FRP composites, it is necessary to first obtain valid data on the FRP's properties. For structures strengthened with prestressed FRP composites, the acquisition of valid data is especially important, since the prestressing load might account for a non-negligible part of the FRP's bearing capacity [11,17,18,19]. An inaccurate appraisal of the FRP's tensile strength will expose the structures to a higher risk of premature failure. Available guidelines stipulate a minimum sample size of five for the FRP tensile coupon test [20][21][22][23]. This stipulation uses a normal distribution to characterize the FRP's properties and assumes a sample coefficient of variation (COV) of 0.058 [21]. With five coupons, it is expected with 95% confidence that the relative error between the sample mean and the true mean value will be less than 5%.
Although a normal distribution is assumed in current guidelines for the characterization of FRP's tensile properties, many studies have shown that a Weibull distribution is a more appropriate model, especially for the tensile strength [24][25][26][27]. Zureick et al. compared the normal, log-normal, and Weibull distributions based on more than 600 samples, and recommended the Weibull distribution for the characterization of FRP's tensile strength and tensile modulus [24]. The same research team also studied the two-parameter and three-parameter Weibull distributions, and recommended the two-parameter model after taking into account the goodness of fit and the computational efficiency [28]. Gomes et al. conducted tensile tests on 1368 coupon samples [27]. They confirmed that an overall good fit can be achieved by any of the normal, log-normal, or Weibull distributions. Despite this, the Weibull distribution provides the best prediction in the tail region, and is therefore more appropriate as the modelling distribution. The inaccuracy of other models in the tail region was also validated by Sanchez-Heres et al. [29]. Atadero compared the normal, log-normal, Weibull, and gamma distributions using more than 900 samples. The results showed that the Weibull distribution had a slight advantage in characterizing the tensile strength of FRP composites [25,30]. In addition to these experimental justifications, there are also theoretical reasons for the use of the Weibull distribution. The Weibull distribution is based on a weakest-link theory, which predicts that the failure of a specimen is due to its weakest link (or largest flaw) [31]. This agrees with the failure mechanism of FRP composites, and supports the use of the Weibull distribution for property modelling. In comparison, the advantages of the normal distribution lie mostly in its ease of understanding and the availability of a closed-form analysis.
However, the normal distribution is symmetric and is therefore not suitable for characterizing the many engineering properties that show some skewness. So far, the Weibull distribution has been adopted in the Composite Materials Handbook-MIL [32] and used by many researchers for the design strength (the 5th percentile value) analysis of FRP composites [24,27,33,34]. Nevertheless, limited efforts have been devoted to sample size analysis using a Weibull distribution for mean value assessment. Bain proposed a method in which the sample size depends only on the accuracy and the percentile of the Weibull distribution [31]. Bain's study facilitated future research. However, the correspondence between the percentile and the mean value is not within the scope of Bain's study. In addition, only a one-sided confidence interval is presented by Bain, whereas an exact two-sided confidence interval is not available. To the best knowledge of the authors, the sample size for mean value assessment based on a Weibull distribution is still not available. It is not clear whether the sample size given by the Weibull distribution is more conservative than that given by the normal distribution.
Besides the selection of the distribution type, the COV value assumed in current guidelines is also inappropriate. The available guidelines assume a maximum value of 0.058 for the sample COV. However, many studies show that the variation can be between 5% and 15% [33,35,36,37]. A sample size of five based on an assumed COV of 0.058 cannot ensure the required accuracy when the sample COV is higher than 0.058.
This study aims to present a sample size analysis based on the Weibull and normal distributions. A confidence level of 95% and a relative error limit of 5% were used for the analysis, in accordance with the stipulations in current guidelines. The sample sizes based on the Weibull and normal distributions were derived, and the conservativeness of the two distributions was compared. The effects of the COV on the sample size determination were also revealed. In addition, the accuracy of using only five coupons for COVs ranging from 0.05 to 0.20 was analyzed. It is expected that the sample size analysis will help researchers and engineers choose the sample size for the FRP tensile test and provide a reference for specifications in FRP guidelines. The maximum COV discussed by the authors is as high as 0.2. This will be beneficial for experiments involving severe environments, where the COVs of FRP's properties may be high [38].

Introduction to Weibull Distribution
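In order to facilitate further discussion, the two-parameter Weibull distribution is first introduced. Its probability density function (PDF) is:

$$f(x) = \frac{\beta}{\theta}\left(\frac{x}{\theta}\right)^{\beta-1}\exp\left[-\left(\frac{x}{\theta}\right)^{\beta}\right], \quad x \ge 0 \tag{1}$$

where θ and β are termed the scale and shape parameters, respectively. The corresponding cumulative distribution function (CDF) is:

$$F(x) = 1 - \exp\left[-\left(\frac{x}{\theta}\right)^{\beta}\right] \tag{2}$$

The mean value, µ, and the coefficient of variation, COV, of the Weibull distribution are:

$$\mu = \theta\,\Gamma\left(1 + \frac{1}{\beta}\right) \tag{3}$$

$$\mathrm{COV} = \sqrt{\frac{\Gamma\left(1 + 2/\beta\right)}{\Gamma^{2}\left(1 + 1/\beta\right)} - 1} \tag{4}$$

where Γ(·) is the gamma function. By virtue of Equation (2), the p-percentile value $x_p$ follows from solving $F(x_p) = p$:

$$x_p = \theta\left[-\ln(1 - p)\right]^{1/\beta} \tag{5}$$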
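As an illustration of the mean and COV formulas of the two-parameter Weibull distribution, the following minimal Python sketch evaluates them and inverts the COV relation numerically to find the shape parameter for a target COV. The function names and the bisection bracket are assumptions made here for illustration; they are not taken from the original study.

```python
import math


def weibull_mean(theta, beta):
    """Mean of a two-parameter Weibull: theta * Gamma(1 + 1/beta)."""
    return theta * math.gamma(1.0 + 1.0 / beta)


def weibull_cov(beta):
    """COV of a two-parameter Weibull; it depends on the shape beta only.

    Uses log-gamma and expm1 so that the subtraction
    Gamma(1+2/beta)/Gamma(1+1/beta)^2 - 1 stays accurate for large beta.
    """
    log_ratio = math.lgamma(1.0 + 2.0 / beta) - 2.0 * math.lgamma(1.0 + 1.0 / beta)
    return math.sqrt(math.expm1(log_ratio))


def beta_from_cov(cov, lo=0.1, hi=1000.0):
    """Shape parameter giving a target COV (hypothetical helper).

    The COV decreases monotonically as beta grows, so plain bisection
    on the assumed bracket [lo, hi] is sufficient.
    """
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if weibull_cov(mid) > cov:
            lo = mid  # COV still too large, so beta must be larger
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for cov in (0.05, 0.10, 0.15, 0.20):
        print(f"COV = {cov:.2f} -> beta = {beta_from_cov(cov):.3f}")
```

Because the COV is independent of the scale parameter θ, a single shape parameter β can be associated with each COV value considered in this study.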

Percentile of the Mean Value
Bain proposed a method in which the sample size depends only on the p-percentile, the confidence level, and the desired confidence interval [31], as mentioned in the Introduction. Since we focus on the mean value, it is necessary to ascertain the percentile of the mean value. In engineering practice, the arithmetic mean $\bar{X}$ (estimated through the method of moments) is commonly used as an approximation of the true mean value µ. However, if the properties conform to a Weibull distribution, a more exact estimate of µ is the maximum likelihood estimator (MLE) $\hat{\mu}$, which is derived from the MLEs of θ and β, denoted $\hat{\theta}$ and $\hat{\beta}$ [24]. The MLEs of θ and β satisfy:

$$\hat{\theta} = \left(\frac{1}{n}\sum_{i=1}^{n} x_i^{\hat{\beta}}\right)^{1/\hat{\beta}} \tag{6}$$

$$\frac{\sum_{i=1}^{n} x_i^{\hat{\beta}} \ln x_i}{\sum_{i=1}^{n} x_i^{\hat{\beta}}} - \frac{1}{\hat{\beta}} - \frac{1}{n}\sum_{i=1}^{n} \ln x_i = 0 \tag{7}$$

where $x_i$ are the sample values, n is the sample size, and $\hat{\theta}$ and $\hat{\beta}$ are the estimators of θ and β. With $\hat{\theta}$ and $\hat{\beta}$, the MLE $\hat{\mu}$ can be derived. The population distribution f(x), the population COV, and the p-percentile value $x_p$ can also be inferred; the estimates are denoted $\hat{f}(x)$, $\widehat{\mathrm{COV}}$, and $\hat{x}_p$, respectively.
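The maximum likelihood estimation described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: it uses the standard Weibull likelihood equations, solves the shape equation by bisection (a choice made here, not necessarily the solver used in the study), and the function names are hypothetical.

```python
import math
import random


def weibull_mle(xs):
    """Return (theta_hat, beta_hat) for a two-parameter Weibull sample.

    Assumes positive sample values that are not all identical.
    """
    n = len(xs)
    x_max = max(xs)
    logs = [math.log(x) for x in xs]
    mean_log = sum(logs) / n
    ys = [x / x_max for x in xs]  # scale to (0, 1] so y**beta cannot overflow

    def score(beta):
        # Likelihood equation for beta; the scaling by x_max cancels in the ratio.
        w = [y ** beta for y in ys]
        weighted_log = sum(wi * li for wi, li in zip(w, logs)) / sum(w)
        return weighted_log - 1.0 / beta - mean_log

    lo, hi = 1e-3, 1.0
    while score(hi) < 0.0:  # expand the bracket until the root is enclosed
        hi *= 2.0
    for _ in range(100):  # score() increases with beta, so bisection works
        mid = 0.5 * (lo + hi)
        if score(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    beta_hat = 0.5 * (lo + hi)
    theta_hat = x_max * (sum(y ** beta_hat for y in ys) / n) ** (1.0 / beta_hat)
    return theta_hat, beta_hat


def weibull_mle_mean(xs):
    """MLE of the mean: theta_hat * Gamma(1 + 1/beta_hat)."""
    theta_hat, beta_hat = weibull_mle(xs)
    return theta_hat * math.gamma(1.0 + 1.0 / beta_hat)


if __name__ == "__main__":
    random.seed(1)
    # random.weibullvariate(alpha, beta): alpha is the scale, beta the shape.
    sample = [random.weibullvariate(100.0, 20.0) for _ in range(10)]
    print(weibull_mle(sample), weibull_mle_mean(sample))
```

The synthetic sample in the usage lines is illustrative only and does not represent FRP test data.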
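Let $\hat{\mu}$ be equal to $\hat{x}_p$. The percentile of the estimated mean value $\hat{\mu}$ in the distribution $\hat{f}(x)$, denoted $\hat{p}$, can then be determined:

$$\hat{p} = 1 - \exp\left\{-\left[\Gamma\left(1 + \frac{1}{\hat{\beta}}\right)\right]^{\hat{\beta}}\right\} \tag{8}$$

It can be seen that the estimator $\hat{p}$ depends only on $\hat{\beta}$. In addition, it is noted from Equation (4) that the estimated $\widehat{\mathrm{COV}}$ depends only on $\hat{\beta}$ as well. Therefore, the percentile $\hat{p}$ of the estimated mean value $\hat{\mu}$ is directly related to $\widehat{\mathrm{COV}}$. In other words, once $\widehat{\mathrm{COV}}$ is derived, the percentile of $\hat{\mu}$ can be determined. Based on Equations (4) and (8), the correspondence between $\widehat{\mathrm{COV}}$, $\hat{\beta}$, and the percentile $\hat{p}$ is presented in Table 1.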
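The percentile occupied by the mean follows from evaluating the Weibull CDF at the mean, which leaves a function of the shape parameter alone. The short sketch below evaluates that relation; the function name is hypothetical, and the shape values in the usage lines are arbitrary illustrations rather than entries of Table 1.

```python
import math


def percentile_of_mean(beta):
    """Percentile of the mean of a two-parameter Weibull distribution.

    Substituting mu = theta * Gamma(1 + 1/beta) into the CDF gives
    p = 1 - exp(-Gamma(1 + 1/beta)**beta); the scale theta cancels.
    """
    # Gamma(1 + 1/beta)**beta, computed through log-gamma for stability
    power = math.exp(beta * math.lgamma(1.0 + 1.0 / beta))
    return 1.0 - math.exp(-power)


if __name__ == "__main__":
    for beta in (5.0, 10.0, 25.0):
        print(f"beta = {beta:5.1f} -> percentile of the mean = {percentile_of_mean(beta):.4f}")
```

For β = 1 the Weibull distribution reduces to the exponential distribution, for which the mean sits at the 1 − e⁻¹ ≈ 63.2% percentile; this gives a quick sanity check of the formula.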

Sample Size Analysis
$\hat{\mu}$ is the inference on the FRP's property based on a limited sample. In order to place $\hat{\mu}$ within a certain confidence interval, the distribution of $\hat{\mu}$ (or of its corresponding percentile value $\hat{x}_p$) is needed. This is illustrated in Figure 1. Bain constructed a pivotal quantity to solve this problem [31]. The pivotal quantity $U_R$ (Equation (9)) is defined in terms of n, R, and $\hat{R}$, where n is the sample size, R = 1 − p is the reliability, and $\hat{R}$ is the estimate of R, which depends on the estimated $\hat{\theta}$ and $\hat{\beta}$. The one-sided confidence limit $U_{R,\gamma}$, for which $P[U_R < U_{R,\gamma}] = \gamma$, and the two-sided confidence limits, for which $P[U_{R,L} < U_R < U_{R,U}] = \gamma$, depend only on the confidence level γ, the percentile p, and the sample size n. Through Monte Carlo simulation, Bain presented the one-sided confidence limit $U_{R,\gamma}$ for a series of γ and p, with the results listed in Chapter 4, Table 4 of Bain [31]. The two-sided percentage points are not available in Bain's study. Therefore, a Monte Carlo simulation was conducted by the authors to obtain the exact two-sided confidence limits $U_{R,L}$ and $U_{R,U}$, using the commercial software MATLAB (MathWorks, Natick, MA, USA). The confidence level was taken as γ = 0.95, and the cases of $\widehat{\mathrm{COV}}$ ranging from 0.05 to 0.20 were analyzed. The resulting two-sided confidence limits are shown in Table 2.
$U_R$ can also be expressed in a form related to $x_p$ (Equation (10)), where $x_{p,\gamma}$ is the confidence limit of $\hat{x}_p$ for a given confidence level γ. For two-sided confidence limit estimation, the lower and upper confidence limits are denoted $x_{p,L}$ and $x_{p,U}$. Transforming Equation (10) gives Equation (11). With Equation (11) and the two-sided percentage points of $U_R$ listed in Table 2, the ratios $x_{p,L}/\hat{x}_p$ and $x_{p,U}/\hat{x}_p$ can be obtained for γ = 0.95 and p equal to the values of $\hat{p}$ listed in Table 1, as presented in Table 3.
Our purpose is to ascertain the sample size such that, at a confidence level γ, the inferred mean value $\hat{x}_p$ lies within a given prediction error. The relative error between the estimated mean value $\hat{x}_p$ and the true mean value µ can be expressed as $|\hat{x}_p - \mu|/\mu$. Since there is a γ confidence that $\hat{x}_p$ lies between $x_{p,L}$ and $x_{p,U}$, the relative error of $\hat{x}_p$ at the γ confidence level is within $\max\{|x_{p,L}/\mu - 1|, |x_{p,U}/\mu - 1|\}$. As the true mean value µ of the population is not available, it is reasonable to substitute µ with its MLE $\hat{x}_p$. Therefore, the relative error limit at the γ confidence level can be expressed as $\max\{|x_{p,L}/\hat{x}_p - 1|, |x_{p,U}/\hat{x}_p - 1|\}$. By utilizing the data in Table 3, the sample size corresponding to various values of $\widehat{\mathrm{COV}}$ within a prediction error of 5% and with a confidence level of 95% can be ascertained, as shown in Table 4. It is worth mentioning that, for engineering convenience, it is acceptable to use the sample COV (sample standard deviation divided by the sample mean) as the $\widehat{\mathrm{COV}}$ in Table 4, which simplifies the calculation of the sample size for the FRP coupon test [24].

Sample Size Based on Normal Distribution
Let $X_1, X_2, \ldots, X_n$ be a sample taken from a normal distribution $N(\mu, \sigma^2)$, where n is the sample size, and µ and σ are the mean and standard deviation of the normal distribution. The statistic

$$T = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$

follows a Student's t distribution with n − 1 degrees of freedom, where $\bar{X}$ is the sample mean and S is the sample standard deviation. The Student's t distribution is symmetric, and therefore the upper and lower confidence limits of T at a confidence level of γ (or 1 − α, where α is termed the significance level) are the 1 − α/2 percentile value $t_{1-\alpha/2}(n-1)$ and the α/2 percentile value $t_{\alpha/2}(n-1)$, respectively. Setting T equal to $t_{1-\alpha/2}(n-1)$ gives:

$$\bar{X} - \mu = t_{1-\alpha/2}(n-1)\,\frac{S}{\sqrt{n}}$$

This formula can be transformed as:

$$n = \left[\frac{t_{1-\alpha/2}(n-1)\,\mathrm{COV}}{e}\right]^{2} \tag{13}$$

where $\mathrm{COV} = S/\bar{X}$ is the sample coefficient of variation and $e = |\bar{X} - \mu|/\bar{X}$ denotes the relative error, or the accuracy. Strictly, the exact expression for e is $e = |\bar{X} - \mu|/\mu$. However, as µ is not available, a practical method is to substitute µ with $\bar{X}$. Equation (13) is an implicit equation for the sample size n, because the value $t_{1-\alpha/2}(n-1)$ also depends on n; n can be determined by trial and error. Table 5 presents the sample sizes corresponding to various COVs, with a confidence level of 0.95 and a relative error limit of 5%. All the values in Table 5 are higher than five, which is the value used in current guidelines. This result confirms the risk in using only five coupons to derive the properties of FRP in the tensile coupon test. In other words, the mean value derived from five coupons might not meet the accuracy requirement of a relative error limit of 5%. It is especially interesting to note that even for a COV of 0.05, the sample size, 7, is higher than the value used in current guidelines, 5.
Theoretically, the sample size based on a COV of 0.05 should be less than the sample size stipulated in current guidelines, since the guidelines assume a slightly higher COV value, 0.058. A step-by-step explanation will be presented in Appendix A.
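The trial-and-error solution of Equation (13) can be sketched as follows. The sketch searches for the smallest n satisfying n ≥ [t₁₋α/₂(n − 1)·COV/e]². Because the Python standard library has no Student's t quantile, the helpers `t_cdf` and `t_quantile` below (hypothetical names) obtain it by Simpson integration and bisection; this is an implementation choice made here for self-containment, not the procedure of the original study.

```python
import math


def t_cdf(x, df, steps=400):
    """CDF of Student's t at x >= 0, by Simpson's rule on [0, x]."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(u):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(u * u / df))

    h = x / steps
    total = pdf(0.0) + pdf(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    # The density is symmetric, so the left half contributes exactly 0.5.
    return 0.5 + total * h / 3


def t_quantile(p, df):
    """Quantile of Student's t for p > 0.5, by bisection on t_cdf."""
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < p:  # expand until the quantile is bracketed
        hi *= 2
    for _ in range(40):
        mid = 0.5 * (lo + hi)
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def required_sample_size(cov, e=0.05, gamma=0.95, n_max=1000):
    """Smallest n with n >= (t_{1-alpha/2}(n-1) * cov / e)**2."""
    alpha = 1 - gamma
    for n in range(2, n_max + 1):
        t = t_quantile(1 - alpha / 2, n - 1)
        if n >= (t * cov / e) ** 2:
            return n
    raise ValueError("no sample size up to n_max meets the error limit")


if __name__ == "__main__":
    for cov in (0.05, 0.10):
        print(f"COV = {cov:.2f}: n = {required_sample_size(cov)}")
```

With a confidence level of 0.95 and a relative error limit of 5%, the search returns 7 coupons for COV = 0.05 and 18 coupons for COV = 0.10, in agreement with the values quoted from Table 5.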
To reveal the prediction error when five coupons are used, Equation (13) is rearranged as:

$$e = \frac{t_{1-\alpha/2}(n-1)\,\mathrm{COV}}{\sqrt{n}} \tag{14}$$

Based on Equation (14), the relative error limit of the derived mean value for various sample COVs can be obtained (see Table 6).
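For a fixed sample size the relative error limit follows directly from e = t₁₋α/₂(n − 1)·COV/√n. The sketch below evaluates it for n = 5 at a 95% confidence level, using the tabulated Student's t value for 4 degrees of freedom as a hard-coded constant; the constant and function names are assumptions for illustration, and the t value must be changed if n or the confidence level changes.

```python
import math

# Tabulated 97.5th percentile of Student's t with 4 degrees of freedom,
# i.e., the two-sided 95% value for n = 5 coupons.
T_975_DF4 = 2.776445


def relative_error_limit(cov, n, t_value):
    """Relative error limit e = t * COV / sqrt(n).

    t_value must be the t percentile for n - 1 degrees of freedom.
    """
    return t_value * cov / math.sqrt(n)


if __name__ == "__main__":
    for cov in (0.05, 0.10, 0.15, 0.20):
        e = relative_error_limit(cov, 5, T_975_DF4)
        print(f"COV = {cov:.2f}: relative error limit = {100 * e:.1f}%")
```

The two ends of this range, COV = 0.05 and COV = 0.20, give approximately 6.2% and 24.8%, matching the prediction errors reported in this paper for five coupons.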

Comparison and Recommendation
By comparing Table 4 with Table 5, it is found that the sample sizes based on the Weibull distribution and the normal distribution are almost the same, with the values based on the normal distribution being slightly larger. Strictly speaking, the sample sizes in Table 4 cannot be directly compared with the values in Table 5, since the key parameter in Table 4 is the MLE $\widehat{\mathrm{COV}}$, whereas in Table 5 it is the sample standard deviation divided by the sample mean. However, since $\widehat{\mathrm{COV}}$ can be substituted by the sample COV for engineering convenience [24], such a comparison is reasonable. The similarity in the sample sizes based on the two distributions shows that the Weibull distribution does not lead to a more conservative estimate of the sample size, and that the sample size based on a normal distribution is applicable.
For the FRP coupon test, it is recommended that the sample sizes listed in Table 5 be used, i.e., 7 coupons for COV = 0.05, 18 coupons for COV = 0.10, etc. It is worth mentioning that if the coupons show large COVs, researchers should carefully check the fabrication and test procedure, rather than simply increasing the sample size to derive a more accurate property value. Large COVs can indicate problems with the quality of the fibers or resins, the impregnation procedure, the preparation and curing of specimens, the test setup, etc. Measures must be taken to correct such errors. The fabrication and testing of samples should follow the procedure recommended by current guidelines [20][21][22][23].

Conclusions
This paper presents an analysis of the sample size for the FRP coupon test. Both the Weibull distribution and the normal distribution were discussed with respect to the sample sizes corresponding to various COVs. It was found that the sample size based on a Weibull distribution is almost the same as that based on a normal distribution (see Tables 4 and 5). In other words, the Weibull distribution does not lead to a more conservative sample size for a derived property with the required accuracy and confidence level. Specifically, according to Tables 4 and 5, the sample size is less than ten for a COV of 5% and more than 60 for a COV of 20%. If only five specimens are used for the tensile coupon test of FRP composites, the possible prediction error ranges from 6.2% to 24.8% as the COV varies from 5% to 20%, which indicates that a minimum sample size of five cannot guarantee the accuracy for increased COVs.