An Efficient Three-Term Iterative Method for Estimating Linear Approximation Models in Regression Analysis

Abstract: This study employs exact line search iterative algorithms for solving large-scale unconstrained optimization problems in which the direction is a three-term modification of an iterative method with two different scaled parameters. The objective of this research is to assess the effectiveness of the new directions both theoretically and numerically. The sufficient descent property and a global convergence analysis of the suggested methods are established. For the numerical experiments, the methods are compared with a previous well-known three-term iterative method, and each method is evaluated over the same set of test problems with different initial points. Numerical results show that the proposed three-term methods are more efficient than, and superior to, the existing method. These methods can also produce an approximate linear regression equation for solving a regression model. The findings of this study can lead to a better understanding of the applicability of numerical algorithms in estimating regression models.


Introduction
The steepest descent (SD) method, introduced in 1847 by [1], is said to be the simplest gradient-based iterative method for the minimization of nonlinear optimization problems without constraints. The method is categorized as a single-objective optimization approach, which attempts to obtain only one optimal solution [2]. However, the method is known to converge very slowly. Since far too little attention has been paid to modifying its search direction, this study suggests a three-term direction to solve large-scale unconstrained optimization functions.
The standard SD method for solving an unconstrained optimization problem $\min_{x \in \mathbb{R}^n} f(x)$ generates iterates by
$$x_{k+1} = x_k + \alpha_k d_k, \qquad (1)$$
where the standard SD direction is $d_k = -g_k$. Throughout this paper, without further specification, $g_k$ denotes the gradient of $f$ at the current iterate $x_k$, and $\|\cdot\|$ denotes the Euclidean norm of a vector. The study will also use $f_k$ as an abbreviation of $f(x_k)$. The superscript $T$ signifies the transpose.
Line search rules are among the methods for computing (1) by determining the direction $d_k$ and the step size $\alpha_k$. Generally, they can be classified into two types: exact and inexact line search rules. The exact line search computes the step size as
$$f(x_k + \alpha_k d_k) = \min_{\alpha \ge 0} f(x_k + \alpha d_k), \qquad (2)$$
while inexact line searches include methods such as the Armijo [3], Wolfe [4], and Goldstein [5] rules.
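As a concrete illustration of (2), not specific to the methods of this paper: for a quadratic objective with $Q$ symmetric positive definite, the exact step admits a closed form, while for a general nonlinear $f$ it must be found by a one-dimensional minimization routine:
$$f(x) = \tfrac{1}{2}x^{T}Qx - b^{T}x \;\Rightarrow\; \frac{d}{d\alpha}f(x_k + \alpha d_k) = (g_k + \alpha Q d_k)^{T} d_k = 0 \;\Rightarrow\; \alpha_k = -\frac{g_k^{T} d_k}{d_k^{T} Q d_k}.$$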
Although the exact line search is quite slow compared to the inexact line search, in recent years an increasing number of studies have adopted the exact line search owing to faster computing power, as in [6]. This research emphasizes the exact line search on the assumption that the current era of fast processors makes this line search advantageous. The remainder of this study is organized as follows: Section 2 discusses the evolution of the SD method; Section 3 presents the proposed three-term SD methods with two different scaled parameters and their convergence analysis; Section 4 illustrates and discusses the numerical results of the proposed methods; Section 5 demonstrates the implementation of all proposed methods in regression analysis; and the last section provides a brief conclusion and some future recommendations.

Evolution of Steepest Descent Method
The issue of search direction modification for the SD method has grown in importance in light of recent work. In 2018, [7] introduced a new descent method that used a three-step discretization with an intermediate step between the initial point $x_0$ and the next iterate point $x_{k+1}$. In 2016, [8] proposed a search direction for the SD method that possesses global convergence properties. The proposed search direction, named ZMRI after the researchers Zubai'ah, Mustafa, Rivaie and Ismail, improves the behavior of the SD method by adding a proportion of the previous search direction to the current negative gradient; this direction is given by (3). The numerical results revealed that ZMRI has superior performance compared to the standard SD method and was also 11 times faster than SD.
Recently, inspired by (3), [9] proposed a scaled SD method that also satisfies global convergence properties. The search direction, denoted $d_k^{RRM}$ after the researchers Rashidah, Rivaie and Mustafa, is given by (4); the value of its scaling parameter was taken from the coefficient in [10]. Later, [11] presented a three-term iterative method for unconstrained optimization problems motivated by [12][13][14]. As can be seen, the authors included a restart feature that directly addresses the jamming problem: when the step becomes too small, the factor $y_{k-1}$ approaches the zero vector. The authors also proved that the method is globally convergent under standard and modified Armijo-type line searches. As a result, the numerical performance of their method is much better than that of the methods in [12][13][14].

Algorithm and Convergence Analysis of New Three-term Search Direction
This section presents the new three-term search direction for the SD method to solve large-scale unconstrained optimization problems. This research develops SD variants that lessen the number of iterations and the CPU time while establishing the theoretical proofs under exact line searches. Motivated by the above evolution of the SD method, the new direction formula is obtained as in (5). By employing a parameter from the conjugate gradient method, which is said to have faster convergence and lower memory requirements [15], two different scaled parameters are introduced: the first defines the direction TTSD1, while the second defines the direction TTSD2, an extension of TTSD1. The idea of the extension arises from recent literature, for instance [16][17][18][19][20], which seeks to improve the performance and effectiveness of existing methods. The proposed directions were implemented with the exact line search procedure in the following algorithm.

Algorithm 1: Steepest Descent Method.
Step 0: Given an initial point $x_0$, set $k = 0$.
Step 1: Compute the search direction $d_k$ as in (5).
Step 2: Evaluate the step length (step size) $\alpha_k$ using the exact line search as in (2).
Step 3: Update the new point $x_{k+1} = x_k + \alpha_k d_k$ as in (1). If $\|g_{k+1}\| \le \varepsilon$, then stop; else set $k := k + 1$ and go to Step 1.
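As a minimal, runnable sketch of Algorithm 1 (our own illustration, not the authors' code): since the scaled parameters of (5) are specific to TTSD1/TTSD2, the direction below uses a hypothetical generic three-term form $d_k = -\theta g_k + \beta d_{k-1} + \gamma y_{k-1}$ with placeholder constants `theta`, `beta`, `gamma`, and the exact line search (2) is approximated by a bounded scalar minimization.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def three_term_sd(f, grad, x0, eps=1e-5, max_iter=10_000,
                  theta=1.0, beta=0.1, gamma=0.1):
    """Algorithm 1 with a placeholder three-term direction.

    theta, beta, gamma stand in for the scaled parameters of (5);
    substitute the TTSD1/TTSD2 formulas here to obtain those methods.
    """
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    d = -g                                   # Step 1 (k = 0): SD direction
    k = 0
    while np.linalg.norm(g) > eps and k < max_iter:
        # Step 2: exact line search (2), approximated numerically
        alpha = minimize_scalar(lambda a: f(x + a * d),
                                bounds=(0.0, 1e3), method="bounded").x
        # Step 3: update the iterate via (1)
        x_new = x + alpha * d
        g_new = grad(x_new)
        y = g_new - g                        # gradient difference
        # Step 1 (next iteration): three-term direction
        d = -theta * g_new + beta * d + gamma * y
        if g_new @ d >= 0:                   # safeguard: restart with SD step
            d = -g_new
        x, g, k = x_new, g_new, k + 1
    return x, k

# Example: minimize the convex quadratic f(x) = ||x - 1||^2
x_star, iters = three_term_sd(lambda x: np.sum((x - 1.0) ** 2),
                              lambda x: 2.0 * (x - 1.0),
                              np.zeros(100))
```

The restart safeguard mirrors the restart feature discussed in Section 2: whenever the combined direction fails to be a descent direction, the iteration falls back to the plain SD step.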

Convergence Analysis
This section presents the theoretical proof that (5) satisfies the convergence analysis, both the sufficient descent condition and the global convergence properties.

Sufficient Descent Conditions
Let the sequences $\{d_k\}$ and $\{x_k\}$ be generated by (5) and (1). The sufficient descent condition requires that
$$g_k^T d_k \le -c\,\|g_k\|^2 \quad \text{for all } k \ge 0 \text{ and some constant } c > 0. \qquad (6)$$
Theorem 1. Consider the three-term search direction given by (5) with the TTSD1 scaled parameters and the step size determined by the exact procedure (2). Then condition (6) holds for all $k \ge 0$.
Proof. Obviously, if $k = 0$, the conclusion is true, since the initial direction reduces to the steepest descent step $d_0 = -g_0$ and hence $g_0^T d_0 = -\|g_0\|^2$. It remains to show that condition (6) also holds for $k \ge 1$.
Therefore, condition (6) holds and thus the proof is complete, which implies that $d_k$ is a sufficient descent direction. □
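As a quick numerical illustration of condition (6), the snippet below is our own check on a random convex quadratic using plain SD, where $d_k = -g_k$ gives $g_k^T d_k = -\|g_k\|^2$, i.e., (6) holds with $c = 1$; the TTSD directions of (5) would be verified the same way.

```python
import numpy as np

# Verify the sufficient descent condition (6): g_k^T d_k <= -c ||g_k||^2.
rng = np.random.default_rng(0)
Q = rng.standard_normal((10, 10))
Q = Q @ Q.T + 10.0 * np.eye(10)              # symmetric positive definite
b = rng.standard_normal(10)
x = np.zeros(10)
for k in range(20):
    g = Q @ x - b                            # gradient of 0.5 x^T Q x - b^T x
    d = -g                                   # plain SD direction
    assert g @ d <= -1.0 * (g @ g) + 1e-12   # condition (6) with c = 1
    alpha = -(g @ d) / (d @ Q @ d)           # exact line search step
    x = x + alpha * d
print("condition (6) held with c = 1 on all 20 iterations")
```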

Global Convergence
The following assumptions and lemma are needed in the analysis of the global convergence of SD methods.

Assumption 1. The level set $\Omega = \{x : f(x) \le f(x_0)\}$ is bounded. In some neighborhood $N$ of $\Omega$, the objective function is continuously differentiable, and its gradient is Lipschitz continuous; namely, there exists a constant $l > 0$ such that $\|g(x) - g(y)\| \le l\,\|x - y\|$ for any $x, y \in N$. These assumptions yield the following Lemma 1.
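For instance (a standard fact, not from the paper), the convex quadratic used earlier satisfies the Lipschitz condition of Assumption 1 globally:
$$g(x) - g(y) = Q(x - y) \;\Rightarrow\; \|g(x) - g(y)\| \le \|Q\|_{2}\,\|x - y\|,$$
so one may take $l = \|Q\|_2$.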
Theorem 2 asserts the global convergence of the method under Assumption 1, that is, $\lim_{k\to\infty}\|g_k\| = 0$. The proof is by contradiction: assume that Theorem 2 is not true, so there exists a constant $\varepsilon > 0$ with $\|g_k\| \ge \varepsilon$ for all $k$. Combined with the sufficient descent condition (6), this inequality contradicts (7); thus $\lim_{k\to\infty}\|g_k\| = 0$, and the proof is complete.

Numerical Experiments
This section examines the feasibility and effectiveness of Algorithm 1 with the use of (4) and (5). Table 1 lists the test functions and their initial points.

Table 1. List of test functions and their initial points.

Number | Function                             | Initial Points
F1     | Extended White and Holst [23]        | (0,0,…,0), (2,2,…,2), (5,5,…,5)
F2     | Extended Rosenbrock [24]             | (0,0,…,0), (2,2,…,2), (5,5,…,5)
F3     | Extended Freudenstein and Roth [24]  | (0.5,0.5,…,0.5), (4,4,…,4), (5,5,…,5)
F4     | Extended Beale [25]                  | (0,0,…,0), (2.5,2.5,…,2.5), (5,5,…,5)
F5     | Raydan 1 [23]                        | (1,1,…,1), (20,20,…,20), (5,5,…,5)
F6     | Extended Tridiagonal 1 [23]          | (2,2,…,2), (3.5,3.5,…,3.5), (7,7,…,7)
F7     | Diagonal 4 [23]                      | (1,1,…,1), (5,5,…,5), (10,10,…,10)
F8     | Extended Himmelblau [25]             | (1,1,…,1), (…
⋮      | ⋮                                    | ⋮
…      | Extended Penalty [23]                | (1,1,…,1), (5,5,…,5), (10,10,…,10)
F19    | Leon [28]                            | (1,1,…,1), (5,5,…,5), (10,10,…,10)

For the purpose of comparison, the methods were evaluated over the same set of test problems (see Table 1). The total number of test problems was twenty-six, each run from three different initial points, with dimensions ranging from 2 to 5000 variables. The results were divided into two groups: in the first group, the proposed directions were compared with the standard and previous SD methods [8,9], while in the second group the numerical results were compared with the three-term iterative method introduced by [11], all under exact line search procedures. Numerical results were compared based on the number of iterations and the CPU time evaluated. In the experiments, the termination condition was $\|g_k\| \le 10^{-5}$. We also forced the routine to stop if the total number of iterations exceeded 10,000.

For the methods analyzed, the performance profile introduced by [22] was implemented to compare the performance of a set of solvers $S$ on a test set of problems $P$. For each problem $p \in P$ and solver $s \in S$, let $t_{p,s}$ denote the measure of interest (number of iterations or CPU time) and $r_{p,s} = t_{p,s} / \min\{t_{p,s'} : s' \in S\}$ the performance ratio. To obtain an overall evaluation of a solver's performance, [22] defined $\rho_s(t)$ as the probability, for a solver $s \in S$, that $r_{p,s}$ is within a factor $t \in \mathbb{R}$ of the best possible ratio. The function $\rho_s : \mathbb{R} \to [0,1]$ is piecewise constant, non-decreasing, and continuous from the right at each breakpoint. Generally, the higher the value of $\rho_s(t)$, or in other words, the solver whose performance profile plot stays on the upper right, wins over the rest of the solvers and represents the best solver.

Figure 1 shows the comparison of the proposed methods with the standard SD, ZMRI, and RRM methods. In Figure 2, to distinguish the proposed search directions from the direction in [11], abbreviated as WH, the present formulas are called the first and second three-term SD methods, TTSD1 and TTSD2, respectively. The performance of all the methods, in terms of the number of iterations evaluated and the central processing unit (CPU) time, respectively, is displayed. From these figures, the TTSD1 method outperforms the other methods in both the number of iterations and the CPU time evaluations. This can be seen on the left side of Figures 1 and 2, where TTSD1 is the fastest method in solving the test problems, and on the right side of the figures, where this method also gives the highest percentage of successfully solved test problems compared to the other methods. The probabilities for all the solvers do not approach 1, which means that none of the methods was able to solve all of the problems tested. The percentage of problems successfully solved by each solver is tabulated in Table 2, which also presents the CPU time per single iteration based on the total iterations and total CPU times. Although the CPU time per iteration of the other methods appears better than that of the proposed methods, TTSD1 and TTSD2 can be considered the superior methods since they solved 81.02% and 82.97% of the functions tested, respectively.
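A performance profile in the sense of [22] is straightforward to compute; the sketch below is our own illustration (names are not from the paper), where `T[p, s]` is the cost (iterations or CPU time) of solver `s` on problem `p` and `np.inf` marks a failure, so failed runs never fall within any finite factor `t`.

```python
import numpy as np

def performance_profile(T):
    """Dolan-More performance profile [22].

    T is a (problems x solvers) array of costs; np.inf marks failures.
    Returns the breakpoints ts and rho[s, i] = fraction of problems
    on which solver s is within factor ts[i] of the best solver.
    """
    ratios = T / T.min(axis=1, keepdims=True)        # r_{p,s}
    ts = np.unique(ratios[np.isfinite(ratios)])      # breakpoints
    rho = np.array([[np.mean(ratios[:, s] <= t) for t in ts]
                    for s in range(T.shape[1])])
    return ts, rho
```

Plotting `rho[s]` against `ts` (often with a logarithmic scale for `t`) reproduces figures of the kind shown in Figures 1 and 2; the curve that stays highest solves the largest fraction of problems within any given factor of the best.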

Implementation in the Regression Model
In modern times, optimal mathematical models have become common resources for researchers; for instance, in the construction industry, these tools are used to minimize costs and maximize profits [31]. The steepest descent method has various applications, mostly in finance, network analysis, and physics, as it is easy to use. One of its most frequent applications is in regression analysis. This paper investigates the use of the proposed directions in describing the relationship between the dorsal fin length and the total length of the silky shark. The data were collected by [32] from March 2018 to February 2019 at the Tanjung Luar Fish Landing Post, West Nusa Tenggara. The study was carried out to set the minimum size of fin products for international trade, and the author also pointed out that these data can be used by the fisheries authority to determine the allowed minimum size of silky shark fins for export.

Figure 3 shows the linear approximation between the total length and the dorsal fin length of the silky shark as $y = 0.125610046x + 0.018027898$. To measure the model performance, the coefficient of determination, $R^2$, was calculated as a standard metric for model error; its value was close to 1, which means there is a strong relationship between the total length of the silky shark and the length of its dorsal fin. The total length of the silky shark was measured from the anterior tip of the snout to the posterior part of the caudal fin, while the dorsal fin length was measured from the fin base to the tip of the fin, as shown in Figure 4.

The linear regression analysis was implemented using the dorsal fin length as the dependent variable $y$ and the total length of the silky shark as the independent variable $x$, with the model $y = \beta_0 + \beta_1 x$. The sum of squares
$$S = \sum_{i=1}^{n} \left(y_i - \beta_0 - \beta_1 x_i\right)^2 \qquad (8)$$
can be minimized by utilizing the concept of calculus, differentiating (8) with respect to all the parameters involved. The resulting equations can be written in matrix form and lead to a system of linear equations. Solving this system by matrix inversion yields the least-squares solution $\hat{\beta} = (X^T X)^{-1} X^T y$, where $X$ is the design matrix. Another way to find the solution of the system of linear equations is by a numerical method. In this context, the proposed three-term methods are implemented as numerical solvers for the system and compared with the aforementioned matrix inversion. To test the efficiency of the proposed methods TTSD1 and TTSD2, Table 3 gives an overview of the estimated model coefficients using the inverse method, the TTSD methods, and also the WH method, together with the number of iterations (with initial point $(0, 0)$). The accuracy and performance of these methods are measured by the sum of relative errors, using the total of the differences between the approximate and the exact values of the data. The sums of relative errors are tabulated in Table 3, where the relative error is defined as

$$\text{Relative Error} = \frac{\text{Exact Value} - \text{Approximate Value}}{\text{Exact Value}},$$
where the exact value is obtained from the actual data and the approximate value is the value obtained by each method involved. From Table 3, it can be observed that TTSD1 has the smallest error, followed by the matrix inversion method and TTSD2, which implies that these two methods are comparable with the direct inverse method.
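To make the comparison in Table 3 concrete, here is a minimal sketch (with hypothetical demo data; the silky shark measurements of [32] are not reproduced here) that fits $y = \beta_0 + \beta_1 x$ both by matrix inversion of the normal equations and by an exact-line-search gradient method applied to (8); plain SD is used as a stand-in for the TTSD1/TTSD2 directions of (5).

```python
import numpy as np

def fit_inverse(x, y):
    """Least squares via the normal equations (matrix inversion)."""
    A = np.column_stack([np.ones_like(x), x])
    return np.linalg.solve(A.T @ A, A.T @ y)         # (b0, b1)

def fit_gradient(x, y, eps=1e-5, max_iter=10_000):
    """Minimize S(b) = ||A b - y||^2 of (8) by gradient descent with
    the exact line search step for a quadratic (SD stand-in for TTSD)."""
    A = np.column_stack([np.ones_like(x), x])
    b = np.zeros(2)                                  # initial point (0, 0)
    k = 0
    while k < max_iter:
        g = 2.0 * A.T @ (A @ b - y)                  # gradient of S
        if np.linalg.norm(g) <= eps:
            break
        d = -g
        # exact step: alpha = -g.d / (d^T Q d) with Q = 2 A^T A
        alpha = -(g @ d) / (2.0 * d @ (A.T @ (A @ d)))
        b = b + alpha * d
        k += 1
    return b, k

# Hypothetical demo data roughly shaped like the fitted line of Figure 3
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([0.1436, 0.2692, 0.3948, 0.5204, 0.6460])
print(fit_inverse(x, y))     # ~ (0.018, 0.1256)
print(fit_gradient(x, y))    # same coefficients, plus iteration count
```

The relative error of each fitted coefficient against the exact values can then be accumulated as in the formula above to reproduce a comparison of the kind reported in Table 3.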

Conclusions and Future Recommendations
The main objective of this paper was to propose three-term SD methods, also known as iterative methods, with two different scaled parameters. The effectiveness of the methods, TTSD1 and TTSD2, was tested by comparison with the previous SD methods (standard, ZMRI, and RRM) and with the three-term method presented in [11], named the WH method, over the same set of test problems under exact line search algorithms. The proposed methods possess the sufficient descent and global convergence properties. Through several tests, the methods TTSD1 and TTSD2 clearly outperform the previous SD and other three-term iterative methods. The reliability of TTSD1 and TTSD2 was found to be consistent with the results obtained by the direct inverse method for the implementation in the regression analysis. This finding shows that the methods are comparable and applicable. There is abundant room for further research on the SD method; in the future, we intend to test TTSD1 and TTSD2 using inexact line searches.