Article

A Method for Determining the Soil Shear Strength by Eliminating the Heteroscedasticity and Correlation of the Regression Residual

by Heng Chi 1,2,*, Hengdong Wang 2, Yufeng Jia 1 and Degao Zou 1
1 Institute of Earthquake Engineering, School of Infrastructure Engineering, Dalian University of Technology, Dalian 116024, China
2 Shanghai Municipal Engineering Design Institute (Group) Co., Ltd., Shanghai 200092, China
* Author to whom correspondence should be addressed.
Appl. Sci. 2025, 15(18), 10289; https://doi.org/10.3390/app151810289
Submission received: 13 June 2025 / Revised: 3 September 2025 / Accepted: 18 September 2025 / Published: 22 September 2025

Abstract

Because of the cost and variability of geotechnical tests, the number of samples available for determining geotechnical material parameters in a single engineering project is limited, which introduces errors into the estimated probability distribution, mean, and variance of the mechanical parameters of geotechnical materials. One way to improve the reliability of geotechnical engineering design is to reduce the variance of the estimated shear strength. Currently, the least squares method is widely used to regress the shear strength of soil; however, the regression residuals often exhibit heteroscedasticity and correlation, which undermine the validity of the variance estimates of the soil shear strength parameters. This study addresses this issue by applying the generalized least squares method to eliminate the heteroscedasticity and correlation of the regression residuals. The results of triaxial consolidated drained (CD) tests on coarse-grained soil; triaxial unconsolidated undrained (UU), CD, and consolidated undrained (CU) tests on gravelly clay; and triaxial CD tests on sand were analyzed to estimate the mean and variance of their shear strength. The results show that while the mean values of the shear strength parameters remain largely unchanged, the generalized least squares method reduces the standard deviation of the cohesion by an average of 30.575% and that of the internal friction angle by an average of 14.21%. This reduction in variability enhances the precision of parameter estimation, which is critical for reliability-based design in geotechnical engineering, as it leads to more consistent safety assessments and optimized structural designs. A reliability analysis of the stability of an infinitely long slope shows that the reliability index calculated using the traditional method is either larger or smaller than that obtained using the generalized least squares method. The generalized least squares method, which eliminates the heteroscedasticity and correlation of the regression residuals, should therefore be adopted to regress the shear strength of soil.

1. Introduction

The limit state design method based on reliability theory has been widely applied in structural engineering and has become a trend in the development of current geotechnical engineering design methods. However, due to the cost and variability of geotechnical test results, the number of samples of geotechnical parameters in one project is limited, and there are certain calculation errors in the determination of the probability distribution, mean value, and variance of soil shear strength parameters, making it difficult to grasp the parameter variability. Moreover, the variance of the mechanical parameters has a significant impact on the reliability of the structure. Therefore, improving the estimation accuracy of the variance of geotechnical parameters and accurately evaluating the variability of their mechanical parameters is a prerequisite for designing geotechnical engineering based on reliability theory.
In geotechnical engineering, the shear strength of soil is mostly determined by the Mohr–Coulomb strength criterion, with cohesion and friction angle being crucial design parameters. Even for soil under complex stress conditions or with different densities, shear strength can be described by establishing a relationship between cohesion and friction angle with the intermediate principal stress coefficient or density [1,2]. The correctness and discreteness of the cohesion and friction angle used in determining the bearing capacity of the foundation, checking the stability of the soil slope, and designing retaining structures all directly affect the safety and economy of the engineering project.
At present, most design standards adopt the partial factor limit state design method based on reliability theory for structural design. Material properties must account for unfavourable deviations from their characteristic values, and the characteristic value is generally taken as a quantile of the probability distribution of the material property. Material strength is generally taken as a lower quantile, whereas the elastic modulus and Poisson's ratio are taken at their mean values (the 0.5 quantile). The 0.05 quantile is commonly adopted internationally [3]. If the material strength follows a normal distribution, the characteristic value $f_k$ of the material strength is $f_k = \mu_f - 1.645\sigma_f$, where $\mu_f$ and $\sigma_f$ are the mean and standard deviation of the material strength. According to relevant Chinese codes [4,5], the characteristic value of the strength of geotechnical materials can be determined using quantile-based methods; for instance, it may adopt the 0.1 quantile, $f_k = \mu_f - 1.28\sigma_f$, or the mean of the values below the overall mean. According to the "Code for Investigation of Geotechnical Engineering" of China [6], the characteristic values of the shear strength of geotechnical materials are calculated as $\phi_k = \gamma_s \phi_m$ with $\gamma_s = 1 - \left(1.704/\sqrt{n} + 4.678/n^2\right)\delta$, where $\phi_m$ is the mean value of the shear strength, $\gamma_s$ is the statistical correction factor, $n$ is the sample size, $\delta = \sigma/\phi_m$ is the coefficient of variation of the shear strength, and $\sigma$ is the standard deviation of the shear strength. Therefore, determining the standard deviation of the shear strength is as important for engineering applications as determining its mean.
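The characteristic-value formulas quoted above can be sketched numerically as follows. This is only an illustrative sketch: the function names and the sample numbers are hypothetical, and the statistical correction factor is assumed to take the minus sign (the unfavourable side for a strength parameter).

```python
import math


def characteristic_value_normal(mean, std, k=1.645):
    """Lower characteristic value of a normally distributed strength.

    k = 1.645 corresponds to the 0.05 quantile and k = 1.28 to the 0.1 quantile.
    """
    return mean - k * std


def statistical_correction_factor(n, cov):
    """gamma_s = 1 - (1.704 / sqrt(n) + 4.678 / n^2) * delta (minus sign assumed)."""
    if n < 1:
        raise ValueError("sample size n must be at least 1")
    return 1.0 - (1.704 / math.sqrt(n) + 4.678 / n**2) * cov


def characteristic_value_code(mean, std, n):
    """phi_k = gamma_s * phi_m, with delta = sigma / phi_m."""
    delta = std / mean
    return statistical_correction_factor(n, delta) * mean


# Hypothetical sample statistics, for illustration only.
phi_k_quantile = characteristic_value_normal(mean=30.0, std=3.0)
phi_k_code = characteristic_value_code(mean=30.0, std=3.0, n=4)
```

Both routes lower the characteristic value as the standard deviation grows, which is why the variance estimate matters as much as the mean.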
At present, the shear strength of geotechnical materials is usually determined using a fitting regression method, and the most commonly used methods include the moment method, least squares method, point group center method, and optimal slope method [7,8,9,10]. Based on a great deal of sampling and experimentation, the application of the moment method or linear regression, two mathematical statistical methods, to determine shear strength is currently the most commonly used and accurate method [7,9]. However, in general, the sample size of each group of experiments is relatively small, and the moment method is greatly affected by experimental errors and human factors. The least squares linear regression method places the experimental points of each group in the same coordinate system for regression calculation, which not only effectively solves the problem of the sample size but also eliminates the errors contained in the cohesion and friction angle obtained from each group of experiments [7]. Although the linear regression method has the characteristics of a large sample size and a certain level of accuracy, the standard deviation obtained is not a true reflection of the variability of the actual shear strength, which has often been overlooked in previous calculations [7,9].
The shear strength parameters of the soil are commonly estimated using the original least squares linear regression method [10,11]. The estimated values of these parameters are unbiased, but the variance estimates are in error. The main reason is that the regression equation does not satisfy the conditions under which the least squares method applies; that is, the residuals between the measured values and the values predicted by the regression equation exhibit heteroscedasticity and correlation. Least squares regression requires the residuals to be independent and to follow a normal distribution with zero mean and equal variance under different confining pressures. However, the regression results of triaxial tests on soil samples indicate that the residuals are heteroscedastic and mutually correlated. Under these conditions, the parameter estimators are no longer efficient because the residual variances are not constant and the residuals are interrelated, and the t-tests and F-tests become invalid. Due to the influence of the variance of each component of the parameter, the fluctuation between the estimated and true values of the parameter increases, which reduces the estimation accuracy; as a result, the parameter variance estimated using the original least squares method is no longer the minimum variance estimate. Consequently, the estimated variance of the shear strength parameters tends to be too large, which leads to greater scatter in the calculation results for geotechnical structures and reduces the accuracy of their safety assessment. This may also lead to economically inefficient designs of geotechnical structures.
Therefore, to enhance the reliability of safety assessment for geotechnical structures, one approach is to study the precise calculation method of the variance of soil shear strength parameters. In this paper, based on triaxial tests, the cohesion and friction angle of soil are linearly regressed using the classical least squares method. During the process, it was found that the regression residuals did not meet the assumptions of homoscedasticity and uncorrelatedness, which violated the application conditions of the least squares method. To address this issue, the square root matrix of the covariance matrix of the regression residuals was adopted to transform the regression equation, ensuring that it met the application conditions of the least squares method. The research flowchart is shown in Figure 1.

2. Method for Organizing the Shear Strength Parameters of Soil

The regression methods for determining soil shear strength parameters mainly include the volume stress–shear stress method ($p$–$q$ method) and the major principal stress–minor principal stress method ($\sigma_1$–$\sigma_3$ method). Chen et al. [7] pointed out that, according to both theoretical derivation and analysis of experimental results, the shear strength parameter values obtained by these two methods are not the same, and they consider the result of the $\sigma_1$–$\sigma_3$ method to be more accurate. Therefore, only the $\sigma_1$–$\sigma_3$ method is explained here.
The triaxial tests give the stress differences $(\sigma_1 - \sigma_3)$ at which the specimens failed under various confining pressures $\sigma_3$. According to the Mohr–Coulomb strength criterion, the failure line equation is as follows:
$$\sigma_1 = 2c\tan\left(\frac{\varphi}{2} + \frac{\pi}{4}\right) + \sigma_3\tan^2\left(\frac{\varphi}{2} + \frac{\pi}{4}\right) = \beta_0 + \beta_1\sigma_3 \tag{1}$$
where $\sigma_1$ is the major principal stress at the time of sample failure; $\sigma_3$ is the minor principal stress, i.e., the sample confining pressure; $c$ and $\varphi$ are the cohesion and friction angle of the soil; and $\beta_0 = 2c\tan\left(\frac{\varphi}{2} + \frac{\pi}{4}\right)$ and $\beta_1 = \tan^2\left(\frac{\varphi}{2} + \frac{\pi}{4}\right)$ are undetermined coefficients.
Regression Equation (1) does not yield $c$ and $\varphi$ directly; they are determined from the intermediate variables $\beta_0$ and $\beta_1$ as follows:
$$\varphi = 2\tan^{-1}\left(\beta_1^{0.5}\right) - \frac{\pi}{2}, \quad c = \frac{\beta_0}{2\beta_1^{0.5}} \tag{2}$$
where $\beta_0$ and $\beta_1$ are obtained through linear regression of the experimental points $(\sigma_3, \sigma_1)$ in Equation (1).
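The conversion between the regression coefficients and the Mohr–Coulomb parameters in Equations (1) and (2) can be sketched as below. The function names are hypothetical, angles are in radians, and the numerical values are illustrative only.

```python
import math


def regression_from_mohr_coulomb(c, phi):
    """Equation (1): intercept beta0 and slope beta1 of the sigma1-sigma3 failure line."""
    t = math.tan(phi / 2.0 + math.pi / 4.0)
    return 2.0 * c * t, t * t


def mohr_coulomb_from_regression(beta0, beta1):
    """Equation (2): cohesion c and friction angle phi (radians) from beta0 and beta1."""
    if beta1 <= 0.0:
        raise ValueError("beta1 must be positive")
    sqrt_b1 = math.sqrt(beta1)
    phi = 2.0 * math.atan(sqrt_b1) - math.pi / 2.0
    c = beta0 / (2.0 * sqrt_b1)
    return c, phi


# Round trip with illustrative values: c = 50 kPa, phi = 30 degrees.
beta0, beta1 = regression_from_mohr_coulomb(50.0, math.radians(30.0))
c_back, phi_back = mohr_coulomb_from_regression(beta0, beta1)
```

For a friction angle of 30 degrees the slope is $\beta_1 = \tan^2 60^\circ = 3$, so the intercept is $2\sqrt{3}$ times the cohesion.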
According to Equation (1) and probability theory, the following Equation (3) can be obtained through derivation:
$$\sigma_{\sigma_1}^2 = \sigma_3^2\sigma_{\beta_1}^2 + \sigma_{\beta_0}^2 + 2\sigma_3\,\mathrm{Cov}(\beta_0, \beta_1) \tag{3}$$
where $\sigma_{\sigma_1}$ is the standard deviation of the major principal stress, and $\sigma_{\beta_0}$ and $\sigma_{\beta_1}$ are the standard deviations of the regression parameters $\beta_0$ and $\beta_1$, respectively. $\mathrm{Cov}(\beta_0, \beta_1)$ is the covariance of the regression parameters $\beta_0$ and $\beta_1$.
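By performing a Taylor expansion of Equation (2) at the mean values of $\beta_0$ and $\beta_1$ and ignoring second-order and higher-order terms, the variances of $\varphi$ and $c$ and their covariance can be expressed as:
$$\sigma_\varphi^2 = \left(\frac{\partial\varphi}{\partial\beta_1}\right)^2\sigma_{\beta_1}^2 \tag{4}$$
$$\sigma_c^2 = \left(\frac{\partial c}{\partial\beta_0}\right)^2\sigma_{\beta_0}^2 + \left(\frac{\partial c}{\partial\beta_1}\right)^2\sigma_{\beta_1}^2 + 2\frac{\partial c}{\partial\beta_0}\frac{\partial c}{\partial\beta_1}\mathrm{Cov}(\beta_0, \beta_1) \tag{5}$$
$$\mathrm{Cov}(c, \varphi) = \frac{\partial\varphi}{\partial\beta_1}\frac{\partial c}{\partial\beta_1}\sigma_{\beta_1}^2 + \frac{\partial\varphi}{\partial\beta_1}\frac{\partial c}{\partial\beta_0}\mathrm{Cov}(\beta_0, \beta_1) \tag{6}$$
The partial derivatives required in Equations (4), (5), and (6) follow from Equation (2):
$$\frac{\partial\varphi}{\partial\beta_1} = \frac{1}{(\beta_1 + 1)\beta_1^{0.5}}, \quad \frac{\partial c}{\partial\beta_0} = \frac{1}{2\beta_1^{0.5}}, \quad \frac{\partial c}{\partial\beta_1} = -\frac{\beta_0}{4\beta_1^{1.5}} \tag{7}$$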
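The first-order propagation of the regression-parameter variances to the cohesion and friction angle can be sketched as follows. The function name and the input numbers are hypothetical; the derivative of $c$ with respect to $\beta_1$ is taken as negative, which follows from differentiating $c = \beta_0/(2\beta_1^{0.5})$.

```python
import math


def strength_parameter_variances(beta0, beta1, var_b0, var_b1, cov_b01):
    """First-order (Taylor) variances of c and phi and their covariance."""
    # Partial derivatives of phi = 2*atan(sqrt(beta1)) - pi/2 and c = beta0/(2*sqrt(beta1)).
    dphi_db1 = 1.0 / ((beta1 + 1.0) * math.sqrt(beta1))
    dc_db0 = 1.0 / (2.0 * math.sqrt(beta1))
    dc_db1 = -beta0 / (4.0 * beta1**1.5)

    var_phi = dphi_db1**2 * var_b1
    var_c = (dc_db0**2 * var_b0 + dc_db1**2 * var_b1
             + 2.0 * dc_db0 * dc_db1 * cov_b01)
    cov_c_phi = dphi_db1 * dc_db1 * var_b1 + dphi_db1 * dc_db0 * cov_b01
    return var_c, var_phi, cov_c_phi


# Illustrative inputs only (not taken from the test data).
var_c, var_phi, cov_c_phi = strength_parameter_variances(
    beta0=100.0, beta1=4.0, var_b0=25.0, var_b1=0.01, cov_b01=-0.2)
```

Because the two derivatives of $c$ have opposite signs, a negative covariance between $\beta_0$ and $\beta_1$ increases the variance of the cohesion rather than reducing it.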
Yu et al. [11] used the orthogonal least squares method to modify the $p$–$q$ regression method for soil shear strength parameters and derived the revised regression coefficients. They proved theoretically that the shear strength parameter values obtained using the modified $p$–$q$ method are consistent with those obtained using the $\sigma_1$–$\sigma_3$ method, and a practical example verified that the $c$ and $\varphi$ values obtained using the two methods are the same. However, the impacts of the heteroscedasticity and correlation of the regression residuals of the $\sigma_1$–$\sigma_3$ method on the regression results were not considered. Chen et al. [9] used the weighted least squares method to determine the shear strength from 320 triaxial consolidated drained (CD) test results in 64 sets for the core wall material of the Xiaolangdi dam. The weighting coefficient was calculated as $w_i = 1/(\alpha + \beta x_i)^2$, where the parameters $\alpha$ and $\beta$ were obtained through linear regression of the residuals on the independent variable. This only partially accounts for the heteroscedasticity of the residuals without fundamentally solving the problem, and no discussion on eliminating the residual correlation has been published [12,13,14,15].

3. Original Least Squares Estimation

Assume that $y$ is the dependent variable, $x$ is the independent variable that affects $y$, and there is a linear relationship between $x$ and $y$. Given $n$ sets of observed values $(x_i, y_i)$, $i = 1, 2, \ldots, n$, $y$ can be expressed as:
$$y_i = \beta_0 + \beta_1 x_i + e_i, \quad i = 1, 2, \ldots, n \tag{9}$$
where $e_i$ is the residual term, which represents the influence of factors other than $x$ on $y$ and the experimental measurement error. $\beta_0$ and $\beta_1$ are unknown parameters that need to be estimated.
If the residual term $e_i$ satisfies the following conditions: (a) zero expectation, $E(e_i) = 0$; (b) equal variance, $\mathrm{Var}(e_i) = \sigma^2$; and (c) no correlation between the residuals, $\mathrm{Cov}(e_i, e_j) = 0$ for $i \neq j$, then the least squares estimation holds. These are the Gauss–Markov conditions [16,17].
The matrix expression of the linear regression model is:
$$y = X\beta + e, \quad E(e) = 0, \quad \mathrm{Cov}(e) = \sigma^2 I \tag{10}$$
where $X = \begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix}$, $y = \begin{bmatrix} y_1 & y_2 & \cdots & y_n \end{bmatrix}^T$, $\beta = \begin{bmatrix} \beta_0 & \beta_1 \end{bmatrix}^T$, and $T$ denotes the transpose of a matrix. $I$ is the identity matrix, and $\sigma^2$ is the variance of the entire residual sequence.
The least squares method obtains the estimate of $\beta$ by minimizing the squared length of the error vector $e = y - X\beta$, i.e.,
$$Q(\beta) = \lVert y - X\beta \rVert^2 = (y - X\beta)^T(y - X\beta) \tag{11}$$
Expanding Equation (11), taking the partial derivative with respect to $\beta$, and setting it to zero gives:
$$\beta = (X^T X)^{-1}X^T y \tag{12}$$
By substituting the $n$ sets of observation data $(x_i, y_i)$ into Equation (12), the estimated value of $\beta$, i.e., $\hat{\beta}$, is obtained as:
$$\hat{\beta} = (X^T X)^{-1}X^T y \tag{13}$$
The covariance matrix of the parameter estimate $\hat{\beta}$ can be expressed as:
$$\mathrm{Cov}(\hat{\beta}) = \sigma^2\left(X^T X\right)^{-1} \tag{14}$$
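A minimal NumPy sketch of the original least squares estimate and its covariance matrix is given below. The function name and the data are hypothetical, and $\sigma^2$ is estimated here as the residual sum of squares divided by $n - 2$, which is a common convention but an assumption of this sketch rather than something stated in the text.

```python
import numpy as np


def ols_fit(x, y):
    """Original least squares fit of y = beta0 + beta1 * x.

    Returns the estimate [beta0, beta1], its covariance matrix
    sigma^2 (X^T X)^-1, and the residual vector.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones_like(x), x])   # design matrix with intercept column
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ y
    residuals = y - X @ beta_hat
    dof = len(y) - X.shape[1]                   # n - 2 degrees of freedom (assumed)
    sigma2 = residuals @ residuals / dof
    return beta_hat, sigma2 * XtX_inv, residuals


# Hypothetical (sigma3, sigma1) points in kPa, for illustration only.
sigma3 = [100.0, 300.0, 500.0, 100.0, 300.0, 500.0]
sigma1 = [520.0, 1180.0, 1850.0, 480.0, 1250.0, 1790.0]
beta_hat, cov_beta_hat, resid = ols_fit(sigma3, sigma1)
```

The residual vector returned here is what the later homogeneity and correlation checks operate on.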

4. Triaxial Tests and Results

Large-scale triaxial consolidated drained (CD) tests were conducted on the coarse-grained soil of Project R, with a total of 11 groups of tests selected based on sample availability and consistency in dry density. Small-scale triaxial CD, unconsolidated undrained (UU), and consolidated undrained (CU) tests were performed on the gravelly clay of Project S; because the dry density of the samples varied, only seven or eight groups with similar densities were used for analysis. Similarly, six groups of small-scale triaxial CD tests were conducted on the sand of Project T. The number of test groups for each material was chosen to ensure statistical reliability while accounting for practical constraints in sample preparation and testing. The large and small triaxial testing instruments are shown in Figure 2. In the large-scale triaxial tests, the samples were 300 mm in diameter and 600 mm in height. The small triaxial apparatus can test samples of two different diameters: for the Project S gravelly clay, the CD, UU, and CU samples were 61.8 mm in diameter and 150 mm in height; for the Project T sand, the samples were 100 mm in diameter and 200 mm in height.
Eleven groups of triaxial test data for coarse-grained soil for Project R were used, and the confining pressures of each group were set to six levels. Considering the nonlinearity of the shear strength of the coarse-grained soil, the Mohr–Coulomb shear strength was divided into multiple linear sections. Here, only the shear strength of the low confining pressure section with three levels of confining pressure (100 kPa, 300 kPa, and 500 kPa) is discussed. Eleven groups of triaxial UU, CD, and CU tests were also carried out on the gravelly clay for Project S. However, due to certain differences in the dry density of the samples, seven or eight groups of test data with similar densities were selected for calculation. The confining pressures for each group of tests were 100 kPa, 300 kPa, and 500 kPa. Six groups of CD tests were carried out on sand for Project T, and the confining pressure of each group of tests was also divided into three levels, i.e., 200 kPa, 400 kPa, and 800 kPa. The major principal stress when the samples failed under different confining pressures for each soil material was obtained from the triaxial tests and is shown in Table 1.

5. Test for Homogeneity of Variance and Correlation of Residuals in Regression Analysis

Strictly speaking, the homogeneity of the variance and independence of the regression residuals under different confining pressures should be determined through hypothesis testing methods. However, due to article length limitations and because this is not the main focus of this paper, we will not elaborate on it here. A simpler method is to determine the homogeneity of the variance and independence of the residuals by checking whether the principal diagonal elements of the covariance matrix are equal and whether the off-diagonal elements are zero. If all of the diagonal elements of the covariance matrix are equal, the regression residuals have homogeneity of variance; otherwise, they have heteroscedasticity. If the non-diagonal elements of the covariance matrix are not all zero, then the regression residuals are correlated; otherwise, the residuals are uncorrelated.
Based on the results of triaxial tests, the principal stress at the failure points of soil samples under different confining pressures is determined, and n sets of regression data of ( σ 3 , σ 1 ) are generated. Then, the original least squares method is used to regress the relationship between the major and minor principal stresses at the point of failure, and the regression coefficients β 0 and β 1 are obtained. Then, the cohesion and friction angle of the soil are calculated using Equation (2).
Figure 3 shows the original least squares regression of the triaxial CD tests for the coarse-grained soil of Project R, the gravelly clay of Project S, and the sand of Project T. After the regression equation is obtained, the residual is calculated as the difference between the measured major principal stress at failure and the value predicted by the regression equation.
Figure 4 shows the regression residuals corresponding to Figure 3. Figure 5 shows the original least squares regression of the triaxial CU and UU tests for the gravelly clay of Project S, and Figure 6 shows the corresponding regression residuals.
The covariance matrix of the residuals under different confining pressures is calculated for the three types of soil materials and five different tests as follows: the residuals of the same soil material under each confining pressure are collected and treated as a vector, the variances and covariances of these vectors are calculated, and the covariance matrix is formed. For the CD test of the coarse-grained soil, the covariance matrix of the original least squares linear regression residuals can be expressed as:
$$\mathrm{Cov}(e_i, e_j) = \begin{bmatrix} 4416.874 & 7868.913 & 6083.824 \\ 7868.913 & 29513.526 & 15243.505 \\ 6083.824 & 15243.505 & 22708.444 \end{bmatrix} = \sigma^2 V \tag{15}$$
where $\sigma^2 = 808011.978$.
Based on Equation (15), the following expression can be obtained:
$$V = \begin{bmatrix} 547 & 974 & 753 \\ 974 & 3653 & 1887 \\ 753 & 1887 & 2810 \end{bmatrix} \times 10^{-5} \tag{16}$$
For the CD test of the gravelly clay, the covariance matrix of the original least squares regression residuals can be expressed as:
$$\mathrm{Cov}(e_i, e_j) = \begin{bmatrix} 1080.945 & 3119.090 & 1978.745 \\ 3119.090 & 11316.455 & 6791.522 \\ 1978.745 & 6791.522 & 5915.705 \end{bmatrix} = \sigma^2 V \tag{17}$$
where $\sigma^2 = 179093.412$.
Based on Equation (17), the following expression can be obtained:
$$V = \begin{bmatrix} 604 & 1742 & 1105 \\ 1742 & 6319 & 3792 \\ 1105 & 3792 & 3303 \end{bmatrix} \times 10^{-5} \tag{18}$$
For the UU test of the gravelly clay, the covariance matrix of the original least squares regression residuals can be expressed as:
$$\mathrm{Cov}(e_i, e_j) = \begin{bmatrix} 6658.055 & 4966.479 & 8975.409 \\ 4966.479 & 5291.741 & 6495.849 \\ 8975.409 & 6495.849 & 14272.594 \end{bmatrix} = \sigma^2 V \tag{19}$$
where $\sigma^2 = 114301.133$.
Based on Equation (19), the following expression can be obtained:
$$V = \begin{bmatrix} 5825 & 4345 & 7852 \\ 4345 & 4630 & 5683 \\ 7852 & 5683 & 12487 \end{bmatrix} \times 10^{-5} \tag{20}$$
For the UU test of the gravelly clay material, the covariance matrix of the original least squares regression residuals can be expressed as:
$$\mathrm{Cov}(e_i, e_j) = \begin{bmatrix} 9730.396 & 1323.982 & 51.0198 \\ 1323.982 & 1628.525 & 2237.728 \\ 51.0198 & 2237.728 & 6342.379 \end{bmatrix} = \sigma^2 V \tag{21}$$
where $\sigma^2 = 151057.858$.
Based on Equation (21), the following expression can be obtained:
$$V = \begin{bmatrix} 6442 & 876 & 34 \\ 876 & 1078 & 1481 \\ 34 & 1481 & 4199 \end{bmatrix} \times 10^{-5} \tag{22}$$
For the CD test of the sand, the covariance matrix of the residuals of the original least squares linear regression can be expressed as:
$$\mathrm{Cov}(e_i, e_j) = \begin{bmatrix} 6007.644 & 6041.93 & 9745.188 \\ 6041.93 & 19186.862 & 20409.064 \\ 9745.188 & 20409.064 & 28520.592 \end{bmatrix} = \sigma^2 V \tag{23}$$
where $\sigma^2 = 757235.266$.
Based on Equation (23), the following expression can be obtained:
$$V = \begin{bmatrix} 793 & 798 & 1287 \\ 798 & 2534 & 2695 \\ 1287 & 2695 & 3766 \end{bmatrix} \times 10^{-5} \tag{24}$$
As can be seen from Figure 4 and Figure 6, the distribution of the regression residuals is not the same under different confining pressures. For the same soil material and test, some of the residual points of the confining pressure are relatively concentrated, while others are scattered, indicating that the residuals under different confining pressures do not have homogeneity of variance. Based on the regression covariance matrices of the different types of soil test results, the covariance matrix is full rank, the diagonal elements are unequal, the non-diagonal elements are not zero, and the non-diagonal elements are basically of the same order of magnitude as the diagonal elements, indicating that the residuals not only have heteroscedasticity but also correlation.
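The simple diagonal and off-diagonal check described in this section can be sketched as follows. The function names and the relative tolerance are assumptions of this sketch; the residuals are assumed to be arranged with one row per test group and one column per confining pressure, and the demonstration matrix is the one reported for the CD tests on the coarse-grained soil.

```python
import numpy as np


def residual_covariance_by_pressure(residuals):
    """Covariance matrix of residuals; rows are test groups, columns are confining pressures."""
    return np.cov(np.asarray(residuals, dtype=float), rowvar=False)


def diagnose(cov, rel_tol=0.05):
    """Return (heteroscedastic, correlated, correlation matrix) for a residual covariance matrix.

    rel_tol is a hypothetical threshold, not a formal hypothesis test.
    """
    cov = np.asarray(cov, dtype=float)
    diag = np.diag(cov)
    std = np.sqrt(diag)
    corr = cov / np.outer(std, std)
    off_diag = corr[~np.eye(len(diag), dtype=bool)]
    heteroscedastic = (diag.max() - diag.min()) > rel_tol * diag.mean()
    correlated = np.any(np.abs(off_diag) > rel_tol)
    return bool(heteroscedastic), bool(correlated), corr


# Residual covariance matrix reported for the coarse-grained soil CD tests.
cov_cd_coarse = np.array([[4416.874, 7868.913, 6083.824],
                          [7868.913, 29513.526, 15243.505],
                          [6083.824, 15243.505, 22708.444]])
hetero, correlated, corr = diagnose(cov_cd_coarse)
```

Working with the correlation matrix rather than the raw covariances makes the off-diagonal check independent of the stress units.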
The above examples illustrate that for coarse-grained soil, sand, and cohesive soil, for both the effective stress shear strength of the CD test and the total shear strength of the UU and CU tests, when using the original least squares method for soil shear strength parameter regression, the residuals have heteroscedasticity and correlation. If the original least squares method is used to conduct the estimation, its parameters will no longer be valid, and the significance tests of variables and the model predictions will all suffer certain degrees of failure. Therefore, efforts should be made to improve the original least squares regression method to eliminate the heteroscedasticity and the correlation of the regression residuals.

6. Eliminating Residual Variance Heterogeneity and Correlation Using the Generalized Least Squares Method

The linear regression equation of the $\sigma_1$–$\sigma_3$ relationship can be expressed as [16,17]:
$$y = X\beta + e, \quad E(e) = 0, \quad \mathrm{Cov}(e) = \sigma^2 V \tag{25}$$
where $V$ is non-singular and positive definite, so there exists a non-singular symmetric matrix $K$ such that $K^T K = KK = V$. Matrix $K$ is the square root matrix of $V$, and $K^T$ and $K^{-1}$ are the transpose and inverse of $K$, respectively.
Multiplying both sides of the regression equation $y = X\beta + e$ by $K^{-1}$ yields $K^{-1}y = K^{-1}X\beta + K^{-1}e$. Let
$$z = K^{-1}y, \quad B = K^{-1}X, \quad g = K^{-1}e \tag{26}$$
Thus, the transformed regression equation can be expressed as:
$$z = B\beta + g \tag{27}$$
After this transformation, the error of the model has a mean of zero, i.e., $E(g) = K^{-1}E(e) = 0$, and the covariance matrix of $g$ can be expressed as:
$$\mathrm{Var}(g) = E\left\{[g - E(g)][g - E(g)]^T\right\} = E(gg^T) = E(K^{-1}ee^T K^{-1}) = K^{-1}E(ee^T)K^{-1} = \sigma^2 K^{-1}VK^{-1} = \sigma^2 K^{-1}KKK^{-1} = \sigma^2 I \tag{28}$$
Therefore, the elements of $g$ have zero mean, constant variance, and no correlation. The conditions of the original least squares estimation are satisfied, and the least squares function can be expressed as [16,17]:
$$Q(\beta) = g^T g = e^T V^{-1}e = (y - X\beta)^T V^{-1}(y - X\beta) \tag{29}$$
Its normal (canonical) equation can be obtained as:
$$X^T V^{-1}X\hat{\beta}^* = X^T V^{-1}y \tag{30}$$
The solution of this equation is:
$$\hat{\beta}^* = (X^T V^{-1}X)^{-1}X^T V^{-1}y \tag{31}$$
The covariance matrix of $\hat{\beta}^*$ can be expressed as:
$$\mathrm{Var}(\hat{\beta}^*) = \mathrm{Var}\left((X^T V^{-1}X)^{-1}X^T V^{-1}y\right) = (X^T V^{-1}X)^{-1}X^T V^{-1}\sigma^2 VV^{-1}X(X^T V^{-1}X)^{-1} = \sigma^2(X^T V^{-1}X)^{-1} \tag{32}$$
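The square-root transformation and the resulting generalized least squares estimate can be sketched in NumPy as follows. Several points are assumptions of this sketch rather than details given in the text: the function names, the hypothetical test data, the estimate of $\sigma^2$ from the transformed residuals with $n - 2$ degrees of freedom, and the construction of the full $n \times n$ matrix as a block-diagonal repetition of the $3 \times 3$ matrix $V$ (observations ordered group by group, with groups treated as mutually independent).

```python
import numpy as np


def symmetric_sqrt(V):
    """Symmetric square root K of a positive definite matrix V, so that K @ K = V."""
    eigvals, eigvecs = np.linalg.eigh(V)
    if np.any(eigvals <= 0.0):
        raise ValueError("V must be positive definite")
    return (eigvecs * np.sqrt(eigvals)) @ eigvecs.T


def gls_fit(x, y, V):
    """Generalized least squares fit of y = beta0 + beta1 * x with Cov(e) = sigma^2 V."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones_like(x), x])
    K_inv = np.linalg.inv(symmetric_sqrt(np.asarray(V, dtype=float)))
    B, z = K_inv @ X, K_inv @ y                 # transformed (whitened) model z = B beta + g
    beta_star, *_ = np.linalg.lstsq(B, z, rcond=None)
    g = z - B @ beta_star
    sigma2 = g @ g / (len(y) - X.shape[1])      # assumed estimator of sigma^2
    cov_beta = sigma2 * np.linalg.inv(B.T @ B)  # equals sigma^2 (X^T V^-1 X)^-1
    return beta_star, cov_beta


# V reported for the coarse-grained soil CD tests; the stress data are hypothetical.
V3 = 1e-5 * np.array([[547.0, 974.0, 753.0],
                      [974.0, 3653.0, 1887.0],
                      [753.0, 1887.0, 2810.0]])
n_groups = 2
V_full = np.kron(np.eye(n_groups), V3)          # block-diagonal assumption
x_demo = np.tile([100.0, 300.0, 500.0], n_groups)
y_demo = np.array([520.0, 1180.0, 1850.0, 480.0, 1250.0, 1790.0])
beta_star, cov_beta_star = gls_fit(x_demo, y_demo, V_full)
```

Solving the whitened problem with an ordinary least squares routine gives the same estimate as the closed-form solution of the normal equation, and avoids forming $V^{-1}$ explicitly in the fit.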

7. The Shear Strength Parameters of Soil

The covariance matrix of the regression parameters $\beta_0$ and $\beta_1$ of the original least squares method is obtained under the Gauss–Markov conditions. However, when the $\sigma_1$–$\sigma_3$ relationship of the soil triaxial test is regressed using the original least squares method, the residuals do not meet the Gauss–Markov conditions, and the covariance matrix of the parameter estimate $\hat{\beta}$ must be modified to:
$$\mathrm{Cov}(\hat{\beta}) = \sigma^2\left(X^T X\right)^{-1}X^T VX\left(X^T X\right)^{-1} \tag{33}$$
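The modified covariance of the original least squares estimate can be compared with the generalized least squares covariance $\sigma^2(X^T V^{-1}X)^{-1}$ in a short sketch. The function names are hypothetical; the values of $V$ and $\sigma^2$ are those reported for the CD tests on the coarse-grained soil, applied here to a single group of three confining pressures purely for illustration.

```python
import numpy as np


def ols_covariance_under_V(X, V, sigma2):
    """Covariance of the original least squares estimate when Cov(e) = sigma^2 V."""
    XtX_inv = np.linalg.inv(X.T @ X)
    return sigma2 * XtX_inv @ X.T @ V @ X @ XtX_inv


def gls_covariance(X, V, sigma2):
    """Covariance of the generalized least squares estimate, sigma^2 (X^T V^-1 X)^-1."""
    return sigma2 * np.linalg.inv(X.T @ np.linalg.inv(V) @ X)


x = np.array([100.0, 300.0, 500.0])             # confining pressures in kPa
X = np.column_stack([np.ones_like(x), x])
V = 1e-5 * np.array([[547.0, 974.0, 753.0],
                     [974.0, 3653.0, 1887.0],
                     [753.0, 1887.0, 2810.0]])
sigma2 = 808011.978

cov_ols = ols_covariance_under_V(X, V, sigma2)
cov_gls = gls_covariance(X, V, sigma2)
# The difference is positive semidefinite, so no parameter variance increases under GLS.
variance_gap = np.diag(cov_ols) - np.diag(cov_gls)
```

This ordering is a general property of the generalized least squares estimator for a known $V$; it does not depend on the particular numbers used here.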
The friction angle and cohesion of the soil can be calculated using Equation (2) after $\beta_0$ and $\beta_1$ are calculated using the original and generalized least squares methods according to Equations (13) and (31). Then, the variance and covariance of the friction angle and cohesion are calculated using Equations (4)–(7). The specific calculations are presented in Table 2 and Table 3. Table 2 shows the regression parameters and their variances. Both the variances of $\beta_0$ and $\beta_1$ and the covariance between $\beta_0$ and $\beta_1$ are smaller for the generalized least squares method than for the original least squares method, indicating that the generalized least squares method is better than the original least squares method [18]. Based on the above analysis, the differences between the ordinary least squares (OLS) and generalized least squares (GLS) methods are illustrated in Figure 7.
It can be seen from Table 3 that for all three soil materials (i.e., coarse-grained soil, sand, and gravelly clay) and for all of the test types (i.e., UU, CD, and CU triaxial tests and large and small triaxial tests), the means of the shear strength parameters obtained using the generalized least squares method, which eliminates the heteroscedasticity and correlation of the residuals, are not significantly different from those obtained using the original least squares method, but the standard deviations obtained using the generalized least squares method are generally smaller. The generalized least squares method thus improves the accuracy of the soil shear strength estimation, which is of great significance for accurately determining the safety of geotechnical engineering projects.
There are two points to be noted in the calculation results, as follows:
(a) As can be seen from Table 3, for the cohesion from the small triaxial CU test on the gravelly clay in Project S, the two calculation methods give noticeably different results: 69.995 kPa with the original least squares method and 96.366 kPa with the generalized least squares method. This is because the test has a low outlier at a confining pressure of 100 kPa (Figure 5). The original least squares method minimizes the sum of squared residuals, so this low outlier inevitably lowers the regression intercept: $\beta_0 = 211.3$ for the original least squares method versus $\beta_0 = 282.3$ for the generalized least squares method. According to Equation (2), the cohesion of the soil is directly proportional to the regression intercept $\beta_0$, so the cohesion determined using the original least squares method is low. In this respect, the generalized least squares regression of the soil shear strength is more robust than the original least squares regression.
(b) As can be seen from Table 3, for the standard deviation of the cohesion of the sand for Project T, there is little difference between the values obtained using the original and generalized least squares regression methods (17.14 and 17.20 kPa, respectively), but the value obtained using the generalized least squares method is slightly larger. This is because there is little difference between the two methods regarding the regression parameter $\beta_0$, with values of 4026.72 and 3950.93 for the generalized and original least squares regression methods, respectively. The variance of the regression parameter $\beta_0$ obtained using the generalized least squares method is lower. However, because the factors that convert the regression parameter variances into the variance of the cohesion differ between the two methods, the final standard deviation of the cohesion is slightly larger.
Table 4 and Table 5 show the change in the standard deviation of the Mohr–Coulomb shear strength parameters between the original and generalized least squares methods. Except for the 0.4% increase in the standard deviation of the cohesion of the sand, all standard deviations are reduced. The standard deviation of the cohesion is reduced by 28–49.3% (average of 30.575%), and that of the friction angle is reduced by 0.25–40.1% (average of 14.21%). Thus, by eliminating the heteroscedasticity and correlation of the residuals, the generalized least squares method clearly reduces the standard deviation of the cohesion and friction angle.

8. The Reliability of Slope Stability Under Different Standard Deviations of Shear Strength

Consider an infinitely long slope that makes an angle $\alpha$ with the horizontal plane and has a slope ratio of 1:$m$. The unit weight of the soil is $\gamma$, the thickness of the soil layer $H$ is 3 m, and the underlying layer is bedrock, so the sliding surface lies within the soil layer along the bedrock surface. As shown in Figure 8, the soil layer shear strength and its standard deviation obtained from the literature are used to analyze and compare the reliability index of the slope stability.
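The anti-sliding stability safety factor can be expressed as:
$$F = \frac{W\cos\alpha\tan\varphi + c/\cos\alpha}{W\sin\alpha} = \frac{\tan\varphi}{\tan\alpha} + \frac{c}{\gamma H\sin\alpha\cos\alpha}$$
Since
$$m = \frac{1}{\tan\alpha},\qquad 2\sin\alpha\cos\alpha = \sin 2\alpha = \frac{2\tan\alpha}{\tan^2\alpha + 1} = \frac{2/m}{1/m^2 + 1} = \frac{2}{m\,(1/m^2 + 1)}$$
the safety factor becomes
$$F = m\tan\varphi + \frac{c\,m\,(1/m^2 + 1)}{\gamma H} = m\tan\varphi + \frac{c\,(m^2 + 1)}{\gamma H m}$$
Letting $B = \dfrac{m^2 + 1}{\gamma H m}$ and $f = \tan\varphi$, the safety factor can be written as:
$$F = mf + Bc$$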
The mean and variance of the safety factor can be expressed, respectively, as:
$$\mu_F = m\mu_f + B\mu_c$$
$$D(F) = m^2 D(f) + B^2 D(c) + 2mB\,\mathrm{Cov}(f, c)$$
$$\sigma_F = \sqrt{D(F)}$$
where $\mu_F$, $\mu_f$, and $\mu_c$ are the means of the safety factor, the friction coefficient, and the cohesion, respectively; $D(F)$, $D(f)$, and $D(c)$ are the variances of the safety factor, the friction coefficient, and the cohesion, respectively; $\mathrm{Cov}(f, c)$ is the covariance of the friction coefficient and the cohesion; and $\sigma_F$ is the standard deviation of the safety factor.
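Since $f = \tan\varphi$, the variance and covariance conversion relationships between the friction coefficient and the internal friction angle, with $\sigma_\varphi$ expressed in radians, can be written as:
$$\sigma_f^2 = \left(\frac{\partial f}{\partial\varphi}\right)^2 \sigma_\varphi^2 = \sec^2(\mu_\varphi)\,\sigma_\varphi^2$$
$$\mathrm{Cov}(f, c) = \sec^2(\mu_\varphi)\,\mathrm{Cov}(\varphi, c)$$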
The limit state equation for slope stability, Equation (42), can be expressed as:
$$Z = F - 1 = 0$$
The corresponding reliability index can be expressed as:
$$\beta = \frac{\mu_F - 1}{\sigma_F}$$
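The mean, variance, and reliability index defined above can be evaluated directly. The following sketch is one possible implementation, with a hypothetical function name `reliability_index`. It is applied to the sand of Project T with the original regression parameters listed in Table 6 (slope ratio 1:2.0, unit weight 19 kN/m³, $H$ = 3 m).

```python
import math

def reliability_index(m, gamma, H, mu_f, mu_c, sigma_f, sigma_c, cov_fc):
    """Reliability index of an infinite slope with safety factor F = m*f + B*c."""
    B = (m ** 2 + 1.0) / (gamma * H * m)     # multiplier of the cohesion
    mu_F = m * mu_f + B * mu_c               # mean of the safety factor
    var_F = (m ** 2 * sigma_f ** 2
             + B ** 2 * sigma_c ** 2
             + 2.0 * m * B * cov_fc)         # variance of the safety factor
    sigma_F = math.sqrt(var_F)
    return (mu_F - 1.0) / sigma_F, mu_F, sigma_F

# Sand CD, slope ratio 1:2.0, original regression (values from Table 6)
beta, mu_F, sigma_F = reliability_index(
    m=2.0, gamma=19.0, H=3.0,
    mu_f=0.642, mu_c=42.800,
    sigma_f=0.028, sigma_c=17.14, cov_fc=0.070)
print(f"mean F = {mu_F:.3f}, sigma F = {sigma_F:.3f}, beta = {beta:.3f}")
```

For these inputs, the result agrees with the corresponding row of Table 6 (mean safety factor 3.161, standard deviation 0.762, reliability index 2.836) to within rounding.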
Based on the results of the CD tests for coarse-grained soil in Project R, the CD tests for sand in Project T, and the UU tests for gravelly clay in Project S, the soil shear strength parameters obtained through the original and generalized least squares regressions were used to calculate the reliability indices of infinitely long slopes with assumed slope ratios of 1:1.3, 1:2.0, and 1:2.0, respectively. Engineering designers generally consider the cohesion of coarse-grained soils unreliable because these soils lack fine-grained cohesive soil. Therefore, the cohesion of the coarse-grained soil in Project R is not considered. The calculation results are shown in Table 6.
It can be seen that the reliability indices of slope stability calculated with the shear strength parameters from the original and generalized least squares regressions are not the same. For the coarse-grained soil in Project R and the sand in Project T, the reliability index calculated with the generalized least squares parameters is smaller than that calculated with the original parameters. This indicates that the original regression overestimated the shear strength of the soil, and the reliability index was overestimated by up to 4.418% (sand in Project T). Thus, estimating the soil shear strength with the original regression may lead to an overly risky engineering design.
However, the opposite is true for the gravelly clay in Project S. With the original least squares regression, the shear strength parameters of the soil are underestimated, and the reliability index of the slope is underestimated by 16.718%, which results in unnecessary engineering cost. Therefore, the original least squares method may lead to either overestimation or underestimation of slope reliability, resulting in either unsafe designs or unnecessary conservatism. The reduction in standard deviation achieved by the generalized least squares method directly translates into more stable and predictable reliability indices, enhancing the safety and economic efficiency of geotechnical designs. Thus, the generalized least squares method, which eliminates the heteroscedasticity and correlation of regression residuals, should be adopted for regressing soil shear strength parameters.
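The two percentages quoted above are consistent with taking the generalized least squares result as the reference value. The sketch below, with illustrative names, recomputes the deviation of the original reliability index from the generalized one using the indices in Table 6.

```python
# Reliability indices from Table 6: (original, generalized)
indices = {
    "Coarse-grained soil CD": (6.016, 5.969),
    "Sand CD": (2.836, 2.716),
    "Gravelly clay UU": (4.603, 5.527),
}

def deviation_percent(original, generalized):
    """Deviation of the original index relative to the generalized index."""
    return 100.0 * (original - generalized) / generalized

deviation = {soil: deviation_percent(*pair) for soil, pair in indices.items()}
for soil, value in deviation.items():
    # Positive: the original method overestimates; negative: it underestimates.
    print(f"{soil}: {value:+.3f}%")
```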

9. Conclusions

The shear strength of soil is an important parameter in geotechnical engineering. The selection of the cohesion and friction angle has a direct impact on the bearing capacity of the foundation, the slope stability, and the design of retaining structures. Reliability-based geotechnical design, in particular, has become a new trend. However, because of the cost and variability of soil tests, very few samples of soil parameters are available in a single engineering project, and the errors in the calculated probability distribution, mean, and variance of the parameters are large, making it difficult to characterize the variability of the parameters. The variance of soil mechanical parameters has a significant impact on the reliability of the structure. Therefore, it is imperative to improve the accuracy with which the variance of the soil shear strength is estimated.
Currently, the shear strength is usually determined by the moment method or the linear regression method. The moment method is strongly affected by experimental errors and human factors. The linear regression method has the advantages of a large sample size and high accuracy, but its estimate of the parameter variance is biased. The main reason is that the regression residuals exhibit heteroscedasticity and correlation, which violates the conditions required by the least squares method for linear regression.
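As an illustration of the linear regression method, the sketch below fits the major principal stress at failure against the confining pressure by ordinary least squares, using the sand CD data from Table 1. The conversion to cohesion and friction angle assumes the principal-stress form of the Mohr–Coulomb criterion, $\sigma_1 = \beta_0 + \beta_1\sigma_3$ with $\beta_1 = \tan^2(45^\circ + \varphi/2)$ and $\beta_0 = 2c\sqrt{\beta_1}$. The function name `ols_line` is hypothetical, and the code is not the authors' implementation.

```python
import math

# Major principal stress at failure (kPa) for the sand CD tests in Table 1,
# keyed by confining pressure sigma_3 (kPa).
sand_cd = {
    200.0: [816.6, 930.9, 799.8, 759.4, 766.9, 700.0],
    400.0: [1756.5, 1612.3, 1558.2, 1551.4, 1368.0, 1422.0],
    800.0: [2999.3, 2994.5, 2917.7, 2710.7, 2606.3, 2706.5],
}

def ols_line(x, y):
    """Ordinary least squares fit of y = b0 + b1 * x."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return y_mean - b1 * x_mean, b1

# Flatten the table into paired observations (sigma_3, sigma_1).
sigma3 = [s3 for s3, values in sand_cd.items() for _ in values]
sigma1 = [s1 for values in sand_cd.values() for s1 in values]
b0, b1 = ols_line(sigma3, sigma1)

# Assumed Mohr-Coulomb conversion from (b0, b1) to (c, phi).
cohesion = b0 / (2.0 * math.sqrt(b1))
phi_deg = math.degrees(2.0 * math.atan(math.sqrt(b1))) - 90.0
print(f"b0 = {b0:.2f}, b1 = {b1:.3f}, c = {cohesion:.2f} kPa, phi = {phi_deg:.2f} deg")
```

The fitted values are close to the original least squares entries for the sand in Tables 2 and 3 ($\beta_0$ = 156.72, $\beta_1$ = 3.352, $c$ = 42.800 kPa, $\varphi$ = 32.71°).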
To eliminate the heteroscedasticity and correlation of the regression residuals, a generalized least squares regression method for soil shear strength was developed in this study. Based on triaxial CD tests on coarse-grained soil and sand and on UU, CD, and CU tests on gravelly clay, the shear strength parameters and their variances were estimated. The generalized least squares method was found to significantly reduce the variance of the shear strength parameters while maintaining comparable mean values. Specifically, the standard deviation of the cohesion decreased by an average of 30.575%, and that of the friction angle by 14.21%. This reduction in variability is not merely statistical: it directly improves the reliability of geotechnical designs by reducing uncertainty in safety assessments, thereby supporting more economical and safer engineering decisions.
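The idea behind generalized least squares can be sketched as a whitening transformation followed by ordinary least squares. The example below is a minimal pure-Python illustration under stated assumptions, not the authors' procedure. The square-root matrix $K$ is taken as the Cholesky factor of $V$ (so that $V = KK^T$), the covariance structure `V` (variance proportional to the confining pressure, correlation `rho` between tests at the same confining pressure) is hypothetical, and the function names are illustrative. The data are a subset of the sand CD results in Table 1.

```python
import math

def cholesky(V):
    """Lower-triangular K with K K^T = V (V symmetric positive definite)."""
    n = len(V)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(K[i][k] * K[j][k] for k in range(j))
            if i == j:
                K[i][j] = math.sqrt(V[i][i] - s)
            else:
                K[i][j] = (V[i][j] - s) / K[j][j]
    return K

def forward_solve(K, b):
    """Solve K z = b for lower-triangular K, i.e. z = K^{-1} b."""
    z = []
    for i, row in enumerate(K):
        z.append((b[i] - sum(row[k] * z[k] for k in range(i))) / row[i])
    return z

def gls_line(x, y, V):
    """GLS fit of y = b0 + b1 * x with residual covariance proportional to V."""
    K = cholesky(V)
    ones_t = forward_solve(K, [1.0] * len(x))  # whitened intercept column
    x_t = forward_solve(K, x)                  # whitened regressor
    y_t = forward_solve(K, y)                  # whitened response
    # Normal equations of ordinary least squares on the whitened data (2 x 2).
    a11 = sum(u * u for u in ones_t)
    a12 = sum(u * v for u, v in zip(ones_t, x_t))
    a22 = sum(v * v for v in x_t)
    r1 = sum(u * w for u, w in zip(ones_t, y_t))
    r2 = sum(v * w for v, w in zip(x_t, y_t))
    det = a11 * a22 - a12 * a12
    return (a22 * r1 - a12 * r2) / det, (a11 * r2 - a12 * r1) / det

# Subset of the sand CD data (Table 1): confining pressure and failure stress, kPa.
x = [200.0, 200.0, 400.0, 400.0, 800.0, 800.0]
y = [816.6, 930.9, 1756.5, 1612.3, 2999.3, 2994.5]

rho = 0.3  # hypothetical correlation between tests at the same confining pressure
V = [[math.sqrt(xi * xj) * (1.0 if i == j else (rho if xi == xj else 0.0))
      for j, xj in enumerate(x)]
     for i, xi in enumerate(x)]

b0_gls, b1_gls = gls_line(x, y, V)
print(f"b0 = {b0_gls:.2f}, b1 = {b1_gls:.3f}")
```

With an identity matrix for `V`, the function reduces to ordinary least squares, which is a convenient sanity check. In practice, the covariance structure has to be estimated from the residuals, and that estimate governs how far the generalized estimates depart from the original ones.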
The reliability analysis of an infinitely long slope shows that the reliability index calculated with the shear strength parameters from the generalized least squares method differs from that calculated with the original method, and it may be either higher or lower. In this example, the reliability index of slope stability decreased by 4.418% for the sand and increased by 16.718% for the gravelly clay when the generalized least squares parameters were used. The original least squares regression of the soil shear strength and its variance is therefore not accurate enough, which may lead to deviations in the reliability assessment of geotechnical engineering. Consequently, the generalized least squares method, which eliminates the heteroscedasticity and correlation of the regression residuals, should be used to regress the soil shear strength.
Despite the effectiveness of the generalized least squares (GLS) method in reducing the variance of shear strength parameters, several limitations should be noted. The method requires prior knowledge of the residual covariance structure, which may not be readily available in practical applications and must be estimated from limited data, potentially introducing additional uncertainty. Future research should focus on developing more robust methods for estimating the covariance matrix, especially for small sample sizes, and exploring the applicability of GLS in nonlinear strength models and under complex stress paths. Additionally, the integration of Bayesian methods or machine learning techniques could further enhance the reliability of parameter estimation and provide a more comprehensive uncertainty quantification.

Author Contributions

H.C.: writing—original draft, software, methodology, investigation, conceptualization, and data curation. Y.J.: supervision, funding acquisition, and formal analysis. H.W.: writing—original draft, supervision, methodology, and conceptualization. D.Z.: supervision, methodology, and investigation. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Science Foundation of China, grant number 52379116, and the Shanghai Craftsman Innovation Studio (Wang Hengdong).

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

Author Chi Heng was employed by Shanghai Municipal Engineering Design Institute (Group) Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. The authors declare no conflict of interest.

Abbreviations

The abbreviations and symbols in this paper are presented below.
$\mu_f$: Mean value of soil material strength
$\sigma_f$: Standard deviation of soil material strength
$\phi_m$: Mean value of soil shear strength
$\gamma_s$: Statistical correction factor
$n$: Number of samples
$\delta$: Variation coefficient of shear strength
$c$: Cohesion
$\varphi$: Internal friction angle
$\sigma_1$: Maximum principal stress at specimen failure
$\sigma_3$: Minimum principal stress at specimen failure
$\sigma_{\sigma_1}$: Variance of maximum principal stress
$\sigma_{\beta_0}$: Standard deviation of $\beta_0$
$\sigma_{\beta_1}$: Standard deviation of $\beta_1$
$e_i$: Residual term
$I$: Identity matrix
$\sigma^2$: Variance of the entire residual sequence
$\beta_0$: Regression parameter estimated by the least squares method
$\hat{\beta}$: Estimated value of $\beta$
$K$: Square root matrix of $V$
$K^T$: Transpose of $K$
$K^{-1}$: Inverse of $K$

References

  1. Bala, B.; Andy, A.B. Modeling of soil shear strength using multiple linear regression (MLR) at Penang, Malaysia. J. Eng. Res. 2021, 9, 40–51.
  2. Tian, D.S.; Zheng, H. The Generalized Mohr-Coulomb Failure Criterion. Appl. Sci. 2023, 13, 5405.
  3. GB 50068-2018; Unified Standard for Reliability Design of Building Structures. Ministry of Housing and Urban-Rural Development of the People’s Republic of China; Standardization Administration of the People’s Republic of China; China Architecture & Building Press: Beijing, China, 2018. (In Chinese)
  4. GB 50199-2013; Unified Standard for Reliability Design of Hydraulic Engineering Structures. Ministry of Water Resources of the People’s Republic of China; China Water & Hydropower Press: Beijing, China, 2013. (In Chinese)
  5. NB/T 10872-2021; Design Code for Rolled Earth-rock Fill Dams. National Energy Administration of the People’s Republic of China; China Electric Power Press: Beijing, China, 2021. (In Chinese)
  6. GB 50021-2001; Code for Investigation of Geotechnical Engineering (2009 Edition). Ministry of Housing and Urban-Rural Development of the People’s Republic of China; China Architecture & Building Press: Beijing, China, 2001. (In Chinese)
  7. Chen, L.; Chen, Z.; Li, G. Discussion of linear regression method to estimate shear strength parameters from results of triaxial tests. Rock Soil Mech. 2005, 26, 1785–1789.
  8. GB/T 50123-2019; Standard for Geotechnical Testing Methods. Ministry of Housing and Urban-Rural Development of the People’s Republic of China; China Architecture & Building Press: Beijing, China, 2019. (In Chinese)
  9. Chen, L.; Chen, Z.; Li, G. A modified linear regression method to estimate shear strength parameters. Rock Soil Mech. 2007, 28, 1421–1426.
  10. Karol, B.; Kazimierz, J.; Artur, Z. On the interpretation of shear parameters uncertainty with a linear regression approach. Measurement 2021, 174, 108949.
  11. Yu, D.; Yao, H.; Wu, S. Difference and modification of regression analysis methods to estimate shear strength parameters obtained by triaxial test. Rock Soil Mech. 2012, 33, 3037–3042.
  12. Phoon, K. Role of reliability calculations in geotechnical design. Georisk 2017, 11, 4–21.
  13. Tomobe, H.; Fujisawa, K.; Murakami, A. A Mohr-Coulomb-Vilar model for constitutive relationship in root-soil interface under changing suction. Soils Found. 2021, 61, 815–835.
  14. Zambrano, M.; Valko, P.; Russell, J. Error in variables for rock failure envelope. Int. J. Rock Mech. Min. Sci. 2003, 40, 137–143.
  15. Zhao, L.H.; Cheng, X.; Dan, H.C.; Tang, Z.P.; Zhang, Y. Effect of the vertical earthquake component on permanent seismic displacement of soil slopes based on the nonlinear Mohr–Coulomb failure criterion. Soils Found. 2017, 57, 237–251.
  16. Montgomery, D.C.; Peck, E.A.; Vining, G.G. Introduction to Linear Regression Analysis, 5th ed.; China Machine Press: Beijing, China, 2022. (In Chinese)
  17. Wang, G.; Chen, M.; Chen, L. Linear Statistical Model—Linear Regression and Variance Analysis; Beijing Higher Education Press: Beijing, China, 2021. (In Chinese)
  18. Lai, Y.; Gao, Z.; Zhang, S.; Chang, X. Stress-strain relationships and nonlinear Mohr strength criteria of frozen sandy clay. Soils Found. 2010, 50, 45–53.
Figure 1. The research flowchart.
Figure 2. Used testing instruments: (a) large-scale triaxial apparatus; (b) small triaxial apparatus.
Figure 3. The original least squares regression graph of the triaxial CD tests for various soil materials.
Figure 4. The original least squares regression residual graph of the triaxial CD tests for various soil materials.
Figure 5. The original least squares regression graph of the triaxial CU and UU tests for gravelly soil.
Figure 6. The original least squares regression residual graph of the triaxial CU and UU tests for gravelly soil.
Figure 7. Flowchart of the analysis procedures for the ordinary least squares (OLS) and generalized least squares (GLS) methods.
Figure 8. An infinitely long soil slope on the bedrock.
Table 1. The major principal stress at failure during the large and small triaxial tests.
| Materials and tests | Confining pressure/kPa | Major principal stress at failure/kPa: 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Coarse-grained soil, CD test | 100 | 1208.07 | 1433.2 | 1370.69 | 1295.48 | 1351.3 | 1291.34 | 1333.8 | 1374.32 | 1352.26 | 1379.37 | 1395.21 |
| | 300 | 2182.25 | 2532.54 | 2636.46 | 2459.26 | 2413.02 | 2374.56 | 2303.87 | 2375.56 | 2534.86 | 2637.45 | 2821.11 |
| | 500 | 3116.99 | 3437.18 | 3534.05 | 3397.87 | 3501.24 | 3520.25 | 3550.83 | 3616.59 | 3396.87 | 3613.27 | 3665.85 |
| Gravelly clay, UU test | 100 | 600.535 | 538.425 | 526.378 | 628 | 700.965 | 701.966 | 728.844 | - | - | - | - |
| | 300 | 1028.73 | 1010.78 | 946.914 | 966 | 1127 | 1063.57 | 1132.24 | - | - | - | - |
| | 500 | 1319.45 | 1287.47 | 1317.51 | 1366 | 1456.43 | 1565 | 1570.54 | - | - | - | - |
| Gravelly clay, CD test | 100 | 486.023 | 513.984 | 517.097 | 488 | 477.826 | 487.572 | 513 | 415.09 | - | - | - |
| | 300 | 1037.59 | 1065.96 | 1010.18 | 977 | 1012.55 | 993.759 | 997.792 | 722.93 | - | - | - |
| | 500 | 1576.08 | 1548.96 | 1560.59 | 1497.72 | 1453.12 | 1445.52 | 1464 | 1343.08 | - | - | - |
| Gravelly clay, CU test | 100 | 203.34 | 461 | 465 | 469 | 434.512 | 470.72 | 472.292 | - | - | - | - |
| | 300 | 891.633 | 992.89 | 959 | 887 | 910.829 | 889.386 | 928.74 | - | - | - | - |
| | 500 | 1344 | 1475.22 | 1306 | 1246 | 1263.65 | 1320 | 1400.19 | - | - | - | - |
| Sand, CD test | 200 | 816.6 | 930.9 | 799.8 | 759.4 | 766.9 | 700 | - | - | - | - | - |
| | 400 | 1756.5 | 1612.3 | 1558.2 | 1551.4 | 1368 | 1422 | - | - | - | - | - |
| | 800 | 2999.3 | 2994.5 | 2917.7 | 2710.7 | 2606.3 | 2706.5 | - | - | - | - | - |
Table 2. Least squares regression parameters $\beta_0$ and $\beta_1$ and their variances.
| Materials, test | Method | $\beta_0$ | $\beta_1$ | Variance of $\beta_0$ | Variance of $\beta_1$ | Covariance of $\beta_0$ and $\beta_1$ | Heteroscedasticity of residual variance | Residual autocorrelation |
|---|---|---|---|---|---|---|---|---|
| Coarse-grained soil, CD | original | 829.81 | 5.3559 | 8361.866 | 0.0935 | −6.657 | exists | exists |
| Coarse-grained soil, CD | generalized | 807.24 | 5.3396 | 3940.604 | 0.0923 | −4.346 | eliminated | eliminated |
| Gravelly clay, UU | original | 443.04 | 1.949 | 6206.01 | 0.0253 | 2.767 | exists | exists |
| Gravelly clay, UU | generalized | 448.41 | 1.977 | 4220.19 | 0.0094 | 0.272 | eliminated | eliminated |
| Gravelly clay, CD | original | 234.45 | 2.497 | 2132.80 | 0.0190 | 1.3909 | exists | exists |
| Gravelly clay, CD | generalized | 240.55 | 2.504 | 429.45 | 0.0166 | −0.6206 | eliminated | eliminated |
| Gravelly clay, CU | original | 211.3 | 2.2783 | 17,806.65 | 0.1376 | −44.09 | exists | exists |
| Gravelly clay, CU | generalized | 282.3 | 2.1453 | 7198.30 | 0.0805 | −21.37 | eliminated | eliminated |
| Sand, CD | original | 156.72 | 3.352 | 4026.72 | 0.0361 | 2.2829 | exists | exists |
| Sand, CD | generalized | 149.42 | 3.289 | 3950.93 | 0.0304 | 1.6266 | eliminated | eliminated |
Table 3. Soil cohesion and friction angle, and their variance, standard deviation, and coefficient of variation.
| Materials, test | Method | $c$/kPa | Friction angle $\varphi$ (°) | Variance of $c$ | Variance of $\varphi$ (10⁻² rad²) | Covariance of $c$ and $\varphi$ | Standard deviation of $c$ | Standard deviation of $\varphi$ (°) | Coefficient of variation of $c$ | Coefficient of variation of $\varphi$ |
|---|---|---|---|---|---|---|---|---|---|---|
| Coarse-grained soil, CD | original | 179.28 | 43.26 | 464.64 | 0.0432 | −0.204 | 21.56 | 1.191 | 0.120 | 0.028 |
| Coarse-grained soil, CD | generalized | 174.67 | 43.20 | 239.94 | 0.043 | −0.167 | 15.49 | 1.188 | 0.089 | 0.028 |
| Gravelly clay, UU | original | 158.67 | 18.77 | 757.38 | 0.15 | −0.010 | 27.52 | 2.22 | 0.173 | 0.118 |
| Gravelly clay, UU | generalized | 159.46 | 19.16 | 541.33 | 0.05 | −0.068 | 23.27 | 1.33 | 0.146 | 0.070 |
| Gravelly clay, CD | original | 74.18 | 25.35 | 204.65 | 0.06 | 0.0286 | 14.306 | 1.429 | 0.193 | 0.0564 |
| Gravelly clay, CD | generalized | 76.00 | 25.42 | 52.65 | 0.05 | −0.081 | 7.256 | 1.332 | 0.095 | 0.0523 |
| Gravelly clay, CU | original | 69.995 | 22.95 | 2435.1 | 0.56 | −3.378 | 49.346 | 4.295 | 0.705 | 0.187 |
| Gravelly clay, CU | generalized | 96.366 | 21.35 | 1207.1 | 0.38 | −2.278 | 34.743 | 3.528 | 0.361 | 0.165 |
| Sand, CD | original | 42.800 | 32.71 | 293.8 | 0.06 | 0.0494 | 17.14 | 1.366 | 0.401 | 0.042 |
| Sand, CD | generalized | 41.197 | 32.25 | 295.9 | 0.05 | 0.0332 | 17.20 | 1.284 | 0.418 | 0.040 |
Table 4. Percentage reduction in the standard deviation of the cohesion and frictional angle after eliminating the heteroscedasticity and correlation of the regression residual.
| Parameter | Gravelly clay, UU: original | Gravelly clay, UU: generalized | Gravelly clay, UU: variation/% | Gravelly clay, CD: original | Gravelly clay, CD: generalized | Gravelly clay, CD: variation/% | Gravelly clay, CU: original | Gravelly clay, CU: generalized | Gravelly clay, CU: variation/% |
|---|---|---|---|---|---|---|---|---|---|
| $c$ (kPa) | 27.52 | 23.27 | 15.4 | 14.306 | 7.256 | 49.3 | 49.346 | 34.743 | 29.6 |
| $\varphi$ (°) | 2.22 | 1.33 | 40.1 | 1.429 | 1.332 | 6.8 | 4.295 | 3.528 | 17.9 |
Table 5. Percentage reduction of the standard deviation of the soil shear strength after eliminating the heteroscedasticity and correlation of the regression residual.
| Parameter | Sand, CD: original | Sand, CD: generalized | Sand, CD: variation/% | Coarse-grained breccia soil: original | Coarse-grained breccia soil: generalized | Coarse-grained breccia soil: variation/% | Average decrease/% |
|---|---|---|---|---|---|---|---|
| $c$ (kPa) | 17.14 | 17.20 | −0.4 | 21.56 | 15.49 | 28 | 30.575 |
| $\varphi$ (°) | 1.366 | 1.284 | 6 | 1.191 | 1.188 | 0.25 | 14.21 |
Table 6. Reliability indexes of slope stability for different soil shear strengths using original and generalized regression methods.
| Types of soil and slope ratio | Unit weight/kN m⁻³ | $B$/10⁻² | Method | Mean value of $c$ | Mean value of $f$ | Standard deviation of $c$ | Standard deviation of $f$ | Covariance of $c$ and $f$ | Mean of safety factor | Variance of safety factor | Standard deviation of safety factor | Reliability index |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Coarse-grained soil CD, slope ratio 1:1.3 | 23 | 3.63 | original | 179.28 | 0.941 | 21.56 | 0.029 | −0.385 | 1.223 | 0.001421 | 0.0377 | 6.016 |
| Coarse-grained soil CD, slope ratio 1:1.3 | 23 | 3.63 | generalized | 174.67 | 0.939 | 15.49 | 0.028 | −0.314 | 1.221 | 0.001372 | 0.0370 | 5.969 |
| Sand CD, slope ratio 1:2.0 | 19 | 4.39 | original | 42.800 | 0.642 | 17.14 | 0.028 | 0.070 | 3.161 | 0.581 | 0.762 | 2.836 |
| Sand CD, slope ratio 1:2.0 | 19 | 4.39 | generalized | 41.197 | 0.631 | 17.20 | 0.027 | 0.046 | 3.069 | 0.580 | 0.762 | 2.716 |
| Gravelly clay UU, slope ratio 1:2.0 | 19 | 4.39 | original | 158.67 | 0.340 | 757.38 | 0.041 | −0.011 | 6.565 | 1.462 | 1.209 | 4.603 |
| Gravelly clay UU, slope ratio 1:2.0 | 19 | 4.39 | generalized | 159.46 | 0.347 | 541.33 | 0.024 | −0.076 | 6.610 | 1.030 | 1.015 | 5.527 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Chi, H.; Wang, H.; Jia, Y.; Zou, D. A Method for Determining the Soil Shear Strength by Eliminating the Heteroscedasticity and Correlation of the Regression Residual. Appl. Sci. 2025, 15, 10289. https://doi.org/10.3390/app151810289
