Gradient Descent-Based Optimization Method of a Four-Bar Mechanism Using Fully Cartesian Coordinates

Machine vibrations often occur due to dynamic unbalance inducing wear, fatigue, and noise that limit the potential of many machines. Dynamic balancing is a main concern in mechanism and machine theory as it allows designers to limit the transmission of vibrations to the frames and base of machines. This work introduces a novel method for representing a four-bar mechanism with the use of Fully Cartesian coordinates and a simple definition of the shaking force (ShF) and the shaking moment (ShM) equations. A simplified version of Projected Gradient Descent is used to minimize the ShF and ShM functions with the aim of balancing the system. The multi-objective optimization problem was solved using a linear combination of the objectives. A comprehensive analysis of the partial derivatives, volumes, and relations between area and thickness of the counterweights is used to define whether the allowed optimization boundaries should be changed in case the mechanical conditions of the mechanism permit it. A comparison between Pareto fronts is used to determine the impact that each counterweight has on the mechanism’s balancing. In this way, it is possible to determine which counterweights can be eliminated according to the importance of the static balance (ShF), dynamic balance (ShM), or both. The results of this methodology when using three counterweights reduces the ShF and ShM by 99.70% and 28.69%, respectively when importance is given to the static balancing and by 83.99% and 8.47%, respectively, when importance is focused on dynamic balancing. Even when further reducing the number of counterweights, the ShF and ShM can be decreased satisfactorily.


Introduction
A complete mechanical balance or dynamic balance of a mechanism consists of eliminating the dynamic reactions at the base of a mechanism produced by the movement of its structure. These dynamic reactions are the shaking force (ShF) and the shaking moment (ShM). Such balancing is desirable because the ShF and ShM cause vibrations of the supporting frame which latter turn into noise, fatigue, wear, etc. The four-bar mechanism is often taken as example in dynamic balancing because its number of applications [1,2] Figure 1 shows a four-bar linkage consisting of four rigid bodies located in the same plane. A motor is located at point A, so the crank AB rotates with velocity ω[rads/s]. l AB , l BD and l CD represent the length of each link. By using Fully Cartesian coordinates [9] to represent the system, it is easy to obtain the equations of the ShF and ShM. This method has been previously used in [22,25,26]. To obtain the mass-matrix M of the whole mechanism, it is necessary to define the mass-matrix M n of each n link. Those matrices can be calculated as presented in [9].

Mass-Matrix of the Four-Bar Linkage Using Fully Cartesian Coordinates
As long as the balance is carried out by adding counterweights, it is necessary to include in M the related parameters. Therefore, each link will be considered as the conjunction of one-bar and one counterweight as shown in Figure 2. The mass m n of each link-counterweight n = 1, 2, 3 is defined as: m n = m bn + m cn (1) where m bn is the mass of the original bar n (n = 1, 2, 3) and m cn is the mass of each counterweight (n = 1, 2, 3). In addition, the counterweight mass in terms of its density and size can be represented as: m cn = πρ cn t cn y cn 2 + x cn 2 (2) where ρ cn is the density of the material used in each counterweight, t cn is the thickness of each counterweight, and x cn , y cn represents the position of the center of mass of the counterweight measured in relation with the local coordinate system O n . The total inertia of each link-counterweight I n , where n = 1, 2, 3, is defined as: I n = 3m cn y cn 2 + x cn 2 2 + I bn (3) where x cn is the x position of the mass center of the counterweight corresponding to the n element measured from the origin O n and y cn is the y position of the mass center of the counterweight corresponding to the n element measured from the origin O n . I bn corresponds to the inertia of each original bar.
The new x position of the mass center for each n element is defined as: X Gn = m cn x cn + m bn x bn m cn + m bn (4) And the new y position of the mass center for each n element is defined as: Y Gn = m cn y cn + m bn y bn m cn + m bn (5) where (x cn , y cn ) is the position of the center of mass of each counterweight, and (x bn , y bn ) is the position of the center of mass of each original bar, both measured from the origin O n .
To avoid the use of extra variables, the mass of each counterweight m cn is defined as: m cn = π(r 2 cn )(t cn )ρ cn (6) where ρ cn represents the density of the counterweight, t cn is the thickness of the counterweight, and r cn represents the ratio of the counterweight. In addition, if the counterweight is considered to be a cylinder, the radius of the counterweight can be defined as: By substituting Equation (7) in (6): and applying the concepts of [9], the mass-matrix representing the whole mechanism can be written as: With:

Linear Momentum and Shaking Force
Once the mass-matrix of the whole system is known and based on the basic points of the whole linkage, it is possible to introduce a vector of positions represented by q: By time-deriving Equation (20), a vector of velocities can be obtained: By introducing matrix B, it is possible to obtain the linear momentum vectors L associated to the whole system: Where B (Equation (23)) is a matrix formed by identity matrices for each of the basic points founded in the mechanism: By solving Equation (22) and considering the velocity of the fixed points always equal to zero (VA X = 0, VA Y = 0, VC X = 0, VC Y = 0), the expressions of linear momentum (L i and L j ) for the linkage are obtained.
The shaking force ShF (Equations (24) and (25)) of the linkage can be calculated from the derivation of Equations L i and L j (Equation (22)). To ensure the linkage is force balanced, these equations must be constant (usually zero) over all the analyzed period of time, thus warranting a null shaking force.

Angular Momentum and Shaking Moment
The use of Fully Cartesian coordinates allows to express the angular momentum H as Equation (26) with r defined as shown in Equation (27).
By solving Equation (26) and considering VA X = 0, VA Y = 0, VC X = 0 and VC Y = 0, the expression of the angular momentum H can be obtained.
A scalar value corresponding to the shaking moment ShM of the linkage that is considered perpendicular to the plane of the 2D mechanism, can be calculated by time-deriving H: where:ṙ andq, the time-derivation of vectorq, represents the acceleration vector.
To guarantee the dynamic balancing of the mechanism, ShM must be kept constant: The time-derivation of H (Equations (28) and (29)) should be zero.
By solving Equation (29) and considering VA X = 0, VA Y = 0, VC X = 0 and VC Y = 0, the ShM of the four-bar linkage can be obtained:

Objective Function
A dimensionless balancing index β i can be used to define the optimization's objective function. This kind of expressions has been previously used in [14,24], and more recently in [20]. As shown in Equation (32), the first balancing index is defined by the relation of the root mean square (rms) of the reaction of the optimized linkage (rms( o Reaction)) with respect to the rms value of the reaction of the original linkage (rms(Reaction)), both considered over a period of time T.
Two objective indexes will be considered taking in account the reactions of the ShF and ShM. As long as they represent the ratio between the reactions of the original mechanism and the optimized one, the result will present the achieved optimization. The value is in the range (0, 1). The nearest this value is to 0, the best balance is achieved. Otherwise (a value close to 1), the result is almost the same that the one obtained without balancing.
The first balancing index β ShF is shown in Equation (32).
where the root mean square of the ShF (rms(ShF)) is given by: and the root mean square of the original ShF (rms( o ShF)) is a constant value obtained from calculating the root mean square of the ShF of the four-bar linkage without any added counterweight.
By substituting Equations (33) and (34) in Equation (32), the balancing index (β ShF ) can be expressed as: Similarly, the second objective index can be calculated when the ShM is considered as the reaction. So, β ShM can be written as: where ShM is the shaking moment of the optimized linkage when using the added counterweights and o ShM is a constant value that represents the shaking moment of the unbalanced linkage. The optimization objective is to minimize β ShF and β ShM considering boundary limits while ensuring that the coordinates of the centers of mass (x cn and y cn ) and the thickness (t cn ) of each counterweight are dimensions that can be used in the mechanical context. Therefore, the boundaries for optimization are limited according to:

Optimization
The optimization theory deals with selecting the best alternative in the sense of the given objective function [27]. It can be applied to solve a wide variety of problems, for example: [28][29][30].
Mathematically, optimization is the process of finding the minimum or the maximum of a function f (X) : R n − > R. X ∈ R n is the vector of variables that can be modified in order to optimize f (X). When minimizing the function f , an optimal solution can be defined as X * where f (X * ) ≤ f (X) for all X ∈ R n . In other words, it is a global minimum.

Simplified Version of the Projected Gradient Descent
Gradient Descent is an iterative algorithm that finds a local minimal in a function [21,27,28]. It starts in a random point and continues until the minimal is reached. This technique is based on the use of the gradient vector to update the solution. The gradient vector ∇f(X) evaluated in a point X points the direction of a maximum ascent. Gradient Descent moves the point in the opposite direction to the gradient. The problem starts with the solution vector X 0 and in each iteration X k is modified according to Equation (40).
where k is the current iteration, α k is the step length, and P k is the direction calculated as an unitary direction vector as shown in Equation (41).
The step length α k is optimized based on the approximation of the Taylor's Theorem (Equation (42)), ensuring this way the maximum descent.
By deriving Equation (42) with respect to α and clearing α:

Finite Difference
Calculating the derivatives in a symbolic way may be difficult on certain occasions. Finite Differences [21] can be used to obtain an approximation of the Gradient Vector and the Hessian matrix.
Mathematically, the Gradient vector ∇f(X) is formed by the first derivatives of the function with respect to all the variables (Equation (44)) and the Hessian matrix ∇ 2 f(X) is composed of the second derivatives (Equation (45)).
To approximate the first partial derivatives, Equation (46) is used. Here, is a scalar constant with a small value ( = 1e − 5), e i is a unitary vector with the size of X that contains 0's in all its positions except in the position i where there is a 1.
Since the Hessian matrix will be multiplied by a vector, a finite approximation is used (Equation (47)), where V is the vector to be multiplied.

Implementation to Optimize the ShF and ShM
The optimization problem consists of minimizing β ShF and β ShM (see Equations (35) and (36)). X is the vector that contains all the variables values that need to be calculated, this is, X = [x c1 , y c1 , t c1 , x c2 , y c2 , t c2 , x c3 , y c3 , t c3 ]. As two functions need to be optimized, β ShF and β ShM , it can be stated that this is a multi-objective problem. In this research, the objectives are in conflict, it means that, when optimizing the values of X for minimizing β ShF , a worse value of β ShM can be obtained and vice-versa. All the solutions X can be represented in a two dimensional plane with the pair (β ShF (X), β ShM (X)), as it can be seen in Figure 4. We say a solution is dominated X d when its two objectives are worse than at least other solution, this is, ∃x : Figure 4b shows an example of non-dominated (dark blue) and dominated solutions (light blue). The Pareto Front [17] is conformed by the non-dominated solutions. According to the specific mechanical reaction that is more urgent to optimize, it is recommended to choose a specific solution from the Pareto Frontier.
The technique weighted sum [31], that is widely known for solving multi-objective problems, was used for defining the objective function of the optimization problem (Equation (48)).
where γ is a scalar value that determines the importance given to each of the objectives of the optimization.
The simplified version of Projected Gradient Descent algorithm is used for solving the optimization problem (Equation (48)). It is described in Algorithm 1. RandomBoxConstraints() calculates a vector of random variables with uniform distribution respecting the specified bounds.
GradientFiniteDi f f ( f , X 0 ) and HessianFiniteDi f f ( f , X k , P k ) calculate the approximation of the Gradient vector and the Hessian matrix based on the Equations (44), (46) and (47). . represents the vector norm. is a small scalar ( = 1e − 5).
To handle bound constraints, a simplified version of Projected Gradient Descent is used, so when a solution is found outside of the boundaries, it is projected to the valid region. The Boolean function isInvalid(X k+1 ) returns true if the obtained vector contains values inside the boundaries otherwise, it returns false. The function clip(X k+1 ) is used to clip the values of the vector if they are not valid.
Clipping the values implies that if the value is smaller than the lower value allowed, then the value is converted to the minimal. In the same way, if the value is bigger than the maximum value allowed, then the value is converted to the maximum.
The algorithm implements two stop conditions: (1) ||X k+1 − X k || < . It means that the difference between X k+1 and X k is too small, so there is no change in the current solution, this also includes when the Gradient vector norm is close to 0. (2) k > N MaxIter. It means that the maximum number of iterations has been reached (In this work N MaxIter = 1000).
Algorithm 1: Projected Gradient Descent with maximum descent.

10
If ||X k+1 − X k || < then break; 11 k = k + 1; 12 end 13 return X k ; The objective function defined in Equation (48) is non convex, therefore it has different local minimums. For this reason, the algorithm 1 is executed 500 times, each one with a different starting point with the goal of finding the different local minimums. Based on a random search for hyperparameters presented in [32], γ is taken as a random value from a uniform distribution in the range (0, 1) (see Equation (48)). When the optimization objective values are less than 1.0, the resultant parameters are stored.

Sensibility Analysis
This subsection shows a way to analyze the Gradient vector values and the constraints. The optimization problem can be defined as: And the constraints can be redefined as: In this case, the Lagrangian is given by: and the KKT : ∇f(X) − δ + π = 0 (52) with δ, π >= 0, defining λ = δ − π ∇f(X) − λ = 0 (53) Then In other words, if the i − th entry of the Gradient has a value equal to 0, the constraint is not affected. If the value is greater than 0, the constraint of the lower border is affected. Finally, if the value is smaller than 0, the constraint of the higher border is affected.

Analysis
This section presents a deep analysis that provides information on the importance of each counterweight and its influence on the balance of the mechanism. It also presents a method to determinate if the optimization limits are the most appropriate or if it is convenient to change them in case the mechanical limitations of the system allows it. Table 1 presents the parameters of a four-bar linkage (Figure 2). The material used for the links is steel with a density of 7800 kg/m 3 . The counterweights are made from brass with a density (ρ c ) of 8500 kg/m 3 .

Mechanical Characteristics
The mechanism is moved by a motor placed at point A, rotating at a constant speed of 500 rpm. Using direct kinematics, it is possible to obtain a sample of the positions (A x , A y , B x , B y , C x , C y ,D x , D y ), speeds ( VB x , VB y , VC x , VC y , VD x , VD y ) and accelerations AB x , AB y , AC x , AC y , AD x , AD y ) corresponding to each of the basic points considered in this system. Figure 3 shows the results of the direct kinematics using the mechanism parameters presented in Table 1

Numerical Results
Among the solutions found in the Pareto front, it is possible to select the one that is the most appropriate according to the specific problem that is being solved.
As an example, three solutions of the Pareto front are taken; the first one is the best result when optimizing the index corresponding to the ShM (β ShM = 0.1600587, β ShF = 0.9152829), the second one is the best result when optimizing the index corresponding to the ShF (β ShM = 0.71311372, β ShF = 0.00295769) and the last one is selected when both ShM and ShF indexes are optimized by almost 60% (β ShM = 0.42969434, β ShF = 0.45176319).
The first chosen solution is the one in the Pareto front with the minimum value in β ShM (β ShM = 0.1600587, β ShF = 0.9152829). It corresponds to the following variables values: x c1 = −0.306033474 y c1 = −0.05120233 t c1 = 0.007840031 x c2 = −0.031811247 y c2 = −0.09348429 t c2 = 0.04 x c3 = −0.115671647 y c3 = 0.112686771 t c3 = 0.04 Figure 5a shows the comparison between the ShF (on the x and y axes) of the original mechanism and the ShF after the optimization. In this case, the total ShF was improved only by 8.47%. But with this solution, it is possible to appreciate that the ShM of the four-bar optimized mechanism is 83.99% better than the original one (Figure 5b). It can be observed that upon the sole use of counterweights, it is not possible to eliminate the ShM, but it can be significantly reduced, reducing also the ShF a bit.  In Figure 5c, it is possible to appreciate that the ShF is significantly reduced (99.70%) compared to the original mechanism. It can be considered that the ShF is almost completely eliminated by using counterweights and also that the ShM has a reduction of 28.69% (Figure 5d).

Partial Derivatives, Volumes and Relation Area-Thickness of Three Counterweights
This subsection presents an analysis to determine if the proposed limits for optimization are the most suitable or if they should be changed (in case there is the possibility of modifying them due to mechanical constraints).
In Figure 6 the box-plots of the partial derivatives with respect to each variable x n , y n and t n for each counterweight n (1 ≤ n ≤ 3) are shown. It is known that an optimal solution is found if all the partial derivatives values are equal to zero. In the box plots of Figure 6 it can be seen that, for the variables x n and y n of the counterweights n (1 ≤ n ≤ 3), the partial derivatives are close to zero, this means that it was possible to reach the optimal values within the proposed optimization limits.
However, for the variables t 1 and t 3 , the partial derivatives are not close to zero. In t 1 , it can be appreciated that the value tends to be greater than zero, hence, it can be deduced that the thickness of Counterweight-1 is trying to be less than the limit 0.005 m. Evidently, this is not possible due to the mechanical limitations that prevent a counterweight thickness from being too close to zero or negative because it is physically impossible. On the other hand, the value of t 3 tends to be less than 0, which means that if the limits of the optimization allowed it, Counterweight-3 would have a thickness greater than 0.04 m. This information, obtained from the partial derivatives analysis, can be very useful to decide the limits of the counterweights when there is the possibility to modify them.
In Figure 7a, the volume value histogram of the counterweights obtained on the different optimization solutions is presented. By analyzing the volume of the counterweights and the relation between their area and thickness (Figure 7b), it can be seen that Counterweight-2 has a very small volume (compared with the other counterweights) and when both area and thickness are very small, Counterweight-2 is almost disappearing from the solution. On the other hand, the relation between the area and the thickness of Counterweight-3 shows how in almost all the cases the thickness is reaching the highest allowed limit; this confirms the information given by the histogram previously analyzed and confirms the conclusion that if mechanical characteristics of the system allows it, it might be advisable to perform the optimization with a slightly larger upper limit for the variable t 3 .
(a) Volumes of counterweights when using three of them (b) Relation between area and thickness of each cylindrical counterweight when using three of them.

Decreasing the Amount of Counterweights Used for Balancing
There could be cases when it is desirable to eliminate one or more counterweights. The reasons can be the resultant volume of the whole mechanism or the cost for implementing the solution. Figure 8 shows the Pareto front of the different optimization results when using all possible combinations (three, two, or only one counterweight). The black stars correspond to the original Pareto front when the three counterweights are used. The blue crosses correspond to the Pareto front when Counterweight-2 has been eliminated and only Counterweights 1 and 3 are being considered. Comparing these results with those obtained when using all the counterweights, it is possible to see that the Pareto fronts are very similar.

Pareto Front Comparison to Eliminate One Counterweight
The green crosses correspond to the Pareto front when using only Counterweights-2 and 3. The yellow crosses correspond to the Pareto front when Counterweight-3 has been eliminated and Counterweights-1 and 2 are in use.
Using the information provided by the Pareto fronts (Figure 8), it is possible to conclude that, if one decides to eliminate a counterweight to simplify the balancing of the mechanism, it should be Counterweight-2, since using only Counterweights-1 and 3 produces a similar result to that obtained when all three counterweights are used. Appendix A shows the numerical results when using only two counterweights giving importance to the optimization of the ShF, ShM, and both. A sensitivity analysis is also presented to decide which counterweight should be used if one decides to use only one counterweight. The partial derivatives, volumes and relation area-thickness is also presented giving information about the selected optimization limits, allowing to further improve the results.
Appendix B shows the numerical results when using only one counterweight and presents the sensitivity analysis and the partial derivatives, volumes and relation area-thickness using the methods previously proposed. Table 2 shows the optimization results when using three, two, and one counterweight.

Expanding Optimization Limits for t 3
As aforementioned, using the box-plots of the partial derivatives with respect to each variable, the volume values histogram of the counterweights, and the graphics of the relation between area and thickness, it is possible to obtain valuable information about the chosen limits of the optimization. The decision of modifying these limits depends on the mechanical characteristics of the whole mechanism.
Taking the example when using only one counterweight (Appendix B) the recommendation after the analysis was to choose a higher limit of thickness t 3 , a new optimization process was executed with the limits: The upper limit of t 3 could be changed from 0.04 m to 0.05 m. Using this value and giving more importance to β ShF , the physical characteristics of Counterweight-3 should be: x c3 = −0.141021300369708 y c3 = −0.00665773774579448 t c3 = 0.0207544372440781 Using this solution, allows the ShF and ShM to be reduced by 73.92% and 14.32%, respectively. On the other hand, when the importance is given to β ShM , the physical characteristics of Counterweight-3 should be: Using this counterweight the ShM and ShF is reduced by 80.17% and 5.30%, respectively. Table 3 compares the optimization results obtained with the expanded limits and the original limits. It can be seen that expanding Counterweight-3 thickness upper limit, improves the ShM. Figure 9 shows the comparison of Pareto fronts before and after changing the optimization limits. It can be noticed that the thickness of Counterweight-3 has a greater influence on the optimization of the ShM of the mechanism. By increasing this limit, optimization results can be better.

Conclusions
By using fully Cartesian coordinates to represent a mechanism, the equations that define the reactions are less complex than those obtained with other methods, hence, the use of this kind of coordinates is suitable for complete balancing, minimization of reactions, and calculation of the ShF and ShM in mechanisms.
The use of fully Cartesian coordinates to represent a four-bar linkage in conjunction with the Simplified Gradient Descent algorithm is a suitable methodology to optimize the balancing of mechanisms. It allows, when using three counterweights and giving more importance to static balancing, to reduce the ShF and ShM by 99.70% and 28.69%, respectively; and when importance is given to the dynamic balancing, to reduce the ShM and the ShF by 83.99% and 8.47%, respectively.
The optimization algorithm was successfully applied to solve the problem. The use of linear combination of functions is a simple yet robust way to handle multi-objectives. The approximation of the derivatives based on Finite Difference allows guiding the algorithm and reducing human hand calculation mistakes.
Comparison between Pareto fronts proves to be an adequate methodology for the sensitivity analysis of each counterweight. This method has proved that even when using only one counterweight, the ShF can be reduced by 78.74% when giving importance to the static balancing or the ShM can be reduced by 73.61% when giving importance to dynamic balancing.
The box-plots of the partial derivatives with respect to each variable, histograms of volumes, and relations between area and thickens allow to analyze the proposed optimization limits in order to decide if they can be changed to obtain even better results.
As future work, it is expected to use these algorithms and analysis to optimize more complex mechanisms in two and three dimensions.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript:

ShF
Shaking Force ShM Shaking Moment Figure A1. Box plots of partial derivatives with respect to each optimization variable when using two counterweights.
(a) Volumes of counterweights when using two of them (b) Relation between area and thickness of each cylindrical counterweight when using two of them Figure A2. Parameters used to analyze the dimension of each counterweight when using two of them.

Appendix B. Analysis Using Only One Counterweight
This appendix presents the analysis when using only one counterweight.

Appendix B.1. Numerical Results
Three solutions are selected form the Pareto front using only Counterweigh-3, they are chosen as follows: 1. If the interest is to optimize the index related to the ShF (β ShF ), without giving importance to the index related to the ShM (β ShM ), the selected solution is: x c3 = −0.11950597007297 y c3 = 0.0984494486824036 t c3 = 0.04 By using this solution, the ShF can be reduced by 78.74%, while the ShM by 0.42%. 2. If the interest is to optimize the index related to the ShM (β ShM ) without giving importance to the index related to the ShF (β ShF ), the selected solution is: x c3 = −0.240986266107467 y c3 = 0.0012908547612233 t c3 = 0.00715115109817927 By using this solution, the ShM can be reduced by 73.61% while the ShF by 3.22%. 3. If the interest is to optimize both indexes (β ShF and β ShM ) the selected solution is: x c3 = −0.130165416712609 y c3 = 0.0500126887169449 t c3 = 0.04 This solution reduces the ShF by 51.19% and the ShM by 53.31%.
In Table 2 the comparison between the optimization results when using three, two and only one counterweight can be observed. It is evident that the best results are obtained when using three counterweights. Still, it is interesting to notice that when using only Counterweight-3, it is possible to improve the balance of the whole mechanism and reduce the total cost of the implementation.
Appendix B.2. Partial Derivatives, Volumes, and Relation Area-Thickness When Using Only One Counterweight Figure A3 shows the box-plots of the partial derivatives with respect to each variable x 3 , y 3 and t 3 when using only the third counterweight. It can be observed that the optimization limits for x 3 and y 3 are adequate, but as t 3 tends to be negative, it is actually trying to become even bigger. This can be confirmed by analyzing the relation between the area and the thickness of Counterweight-3 in Figure A4 where the thickness in all the cases is trying to take higher values. So, if the solution will be implemented using only one counterweight, it could be interesting to allow a higher limit on the optimization of this variable, so even better results can be achieved.