Geographically Weighted Three-Parameters Bivariate Gamma Regression and Its Application

This study develops a regression model for response variables that follow a bivariate gamma distribution with three parameters (shape, scale and location), taking spatial effects into account so that the parameter estimates differ for each location. This model is called geographically weighted bivariate gamma regression (GWBGR). The parameters are estimated by maximum-likelihood estimation (MLE) with the Berndt–Hall–Hall–Hausman (BHHH) algorithm. Parameter testing consists of a simultaneous test using the maximum-likelihood ratio test (MLRT) and a partial test using the Wald test. The results of three-parameter GWBGR modeling with a fixed bisquare kernel weighting show that the variables that significantly affect the rate of infant mortality (RIM) and the rate of maternal mortality (RMM) are the percentage of poor people, the percentage of obstetric complications treated, the percentage of pregnant mothers who received Fe3 and the percentage of first-time pregnant mothers under seventeen years of age, whereas the percentage of households with a clean and healthy lifestyle is significant only in several regencies and cities.


Introduction
Distribution is a statistical concept used in data research. Statisticians seeking to identify the outcomes and probabilities of a particular study will chart measurable data points from a data set, resulting in a probability distribution diagram. Statisticians can identify the development of either a discrete or continuous distribution by the nature of the outcomes to be measured. Discrete distributions have a countable number of outcomes, which means that the potential outcomes can be put into a list. A continuous distribution is built from outcomes that fall in a continuum.
One type of continuous distribution is the gamma distribution. Several studies on the gamma distribution focus on parameter estimation. One study estimated the parameters of the three-parameter gamma distribution using maximum marginalized order statistics likelihood estimation (MMosLE) and compared the results with two alternatives: location and scale parameters free maximum-likelihood estimators (LSPF-MLE) and Bayesian methods (BL) [1]. Another study estimated the parameters of the k-variate three-parameter gamma distribution using a heuristic approach; the estimates were obtained through simulation using the maximum-likelihood function, the maximum product of spacing (MPS) and least squares (LS), with LS producing the better estimates [2]. Rahayu, Purhadi, Sutikno and Prastyo [3] conducted a study on the theory of parameter estimation and hypothesis testing for trivariate gamma regression. As a development, Rahayu, Purhadi, Sutikno and Prastyo [4] studied the theory of parameter estimation, hypothesis testing and its application to multivariate gamma regression.
Gamma regression is used for modeling cases with continuous response variables that have a positive skewness. Gamma regression for two response variables is called bivariate gamma regression (BGR). This research focuses on parameter estimation using the MLE method and simultaneous hypothesis testing using MLRT and partially using the Wald test.
A global regression model produces parameter estimates that apply to all observed locations. Its interpretation assumes that every observation location has the same characteristics. In practice, however, locations often differ in their characteristics because of geographical, economic, cultural or other factors, so spatial effects need to be considered. A regression model that takes this spatial heterogeneity into account is geographically weighted regression (GWR). GWR is a form of spatial analysis in which observations are weighted according to the position of, or the distance between, the location of an observation and the other observation sites [5].
GWR has been studied and applied in many fields, such as health, the environment and household items. Examples include modeling the proportion of households that have a telephone in the state of Sao Paulo, Brazil [6]; an analysis of the causes of haze pollution in China, which showed that the GWR model performed better than global models [7]; and an investigation of the relationship between the level of disease and social and economic characteristics in Tokyo [8].
GWR models for data with a two-parameter gamma distribution have been developed for both univariate and bivariate data. For univariate data, parameter estimation and hypothesis testing were studied for the geographically weighted gamma regression (GWGR) model. The parameters were estimated by maximum-likelihood estimation, but the resulting estimators have no closed form, so numerical optimization with the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm was used [9]. For bivariate data, the rate of maternal mortality (RMM) and the rate of infant mortality (RIM) in North Sumatra province in 2017 were modeled with geographically weighted bivariate gamma regression (GWBGR).
RIM and RMM are examples of quantitative data. Both indicators are included in the Sustainable Development Goals (SDGs), which were declared by the member states of the United Nations (UN) as global development goals and as a follow-up program to the Millennium Development Goals (MDGs). The SDG targets for maternal and child health are an RMM of less than 70 per 100,000 live births (KH) and an RIM as low as 25 per 1000 KH [10]. The MDG target for RMM is 102 per 100,000 KH, but the RMM values in North Sulawesi, Gorontalo and Central Sulawesi were 169, 253.42 and 208, respectively [11][12][13], which means that the target was not reached. RIM has met the MDG target of below 32 per 1000 KH, but from year to year it has not declined significantly in the three provinces.
Figures 1 and 2 show histograms of the RIM and RMM data compared with the two-parameter and three-parameter gamma distributions. The histograms follow the three-parameter gamma distribution more closely than the two-parameter one; this is due to the location parameter of the three-parameter gamma distribution, which shifts the distribution along the horizontal axis.
Current research on gamma regression is limited in the number of response variables and does not involve spatial effects. Therefore, this study develops a regression with two response variables that follow a bivariate gamma distribution and involves spatial effects. Previous studies of GWGR and GWBGR are limited to two parameters: the shape and scale parameters. Based on the pattern shown in Figures 1 and 2 for the RIM and RMM data from North Sulawesi, Gorontalo and Central Sulawesi, this research models RIM and RMM as correlated responses that follow a three-parameter gamma distribution. The focus of this study is to determine the parameter estimators and test statistics of BGR and GWBGR with three parameters for the case of RIM and RMM in North Sulawesi, Gorontalo and Central Sulawesi. If the MLE method produces equations that are not in closed form, the BHHH method is used.

The Gamma Distribution
Let Y be a random variable that follows a three-parameter gamma distribution, with the density function given as follows [14].
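For reference, the standard form of the three-parameter gamma density is sketched below. The symbols α (shape), θ (scale) and γ (location) are assumed to match the Gamma(α, θ, γ) notation used in this section; the exact parameterization in [14] may differ.

```latex
f(y \mid \alpha, \theta, \gamma) =
  \frac{(y-\gamma)^{\alpha-1}}{\theta^{\alpha}\,\Gamma(\alpha)}
  \exp\!\left(-\frac{y-\gamma}{\theta}\right),
  \qquad y > \gamma,\; \alpha > 0,\; \theta > 0 .
```

Under this form, setting γ = 0 recovers the two-parameter gamma density, so the location parameter only shifts the distribution along the horizontal axis.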
When two three-parameter gamma response variables are not independent, their joint distribution is called the bivariate three-parameter gamma distribution. Suppose V_i are independent random variables with V_i ∼ Gamma(α_i, θ, γ_i), i = 1, 2. The density function of z_1, z_2 can be written as follows [14,15].
The two-parameter bivariate gamma regression model is as follows.
The correlation between the response variables is tested with a t statistic, and the null hypothesis is rejected if $|t| > t_{(\alpha/2;\,n-2)}$.
Multicollinearity refers to a situation in which two or more independent variables in a multiple regression model are highly linearly related. The variance inflation factor (VIF) is often used to measure the degree of multicollinearity [17]: $\mathrm{VIF}_j = 1/(1 - R_j^2)$, where $R_j^2$ is the coefficient of determination obtained by regressing $x_j$ on the other predictor variables.
Spatial effects in the data can be seen from the spatial heterogeneity among locations, which arises because the characteristics of one observation location differ from those of other locations. Spatial heterogeneity can be tested using the following hypothesis.
The test statistic G is used, with Σ evaluated under the null hypothesis. The null hypothesis is rejected if $|G| > \chi^2_{(\alpha,2k)}$, where k is the number of predictor variables and α is the significance level.
The weighting function based on the Euclidean distance can be formed using a kernel function. Some of the kernel functions proposed in [18] are as follows.

1. Fixed Gaussian kernel function, where w_ii* is the weight of the observation at the i-th location relative to location i*, g is the bandwidth and d_ii* is the Euclidean distance between locations i and i*.

4. Adaptive Gaussian kernel function.

The corrected Akaike information criterion (AICc) is used to compare models and is calculated by the formula in [19], where n is the sample size and K is the number of parameters.
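As an illustration of how such a criterion is computed, the sketch below uses the common small-sample form AICc = −2 ln L + 2K + 2K(K + 1)/(n − K − 1). This form is an assumption; the exact expression in [19] may differ. The function name `aicc` and the log-likelihood values are hypothetical.

```python
def aicc(log_likelihood, n_params, n_obs):
    """Corrected AIC (assumed form): -2 ln L + 2K + 2K(K + 1) / (n - K - 1)."""
    if n_obs - n_params - 1 <= 0:
        raise ValueError("AICc requires n_obs > n_params + 1")
    aic = -2.0 * log_likelihood + 2.0 * n_params
    return aic + 2.0 * n_params * (n_params + 1) / (n_obs - n_params - 1)


# Hypothetical log-likelihoods and parameter counts, for illustration only.
candidates = {
    "global": aicc(log_likelihood=-120.0, n_params=5, n_obs=34),
    "local": aicc(log_likelihood=-110.0, n_params=8, n_obs=34),
}

# The preferred model is the one with the smallest AICc.
best = min(candidates, key=candidates.get)
```

The correction term grows quickly as K approaches n, which matters here because only 34 regencies and cities are observed.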

Data and Method
The data used in this study came from the 2016 Health Profile publications of North Sulawesi, Gorontalo and Central Sulawesi and the 2016 Welfare Statistics publications of North Sulawesi, Central Sulawesi and Gorontalo (Sulutenggo). The observation units were 34 regencies and cities in North Sulawesi, Gorontalo and Central Sulawesi. The data comprised two response variables, namely the rate of infant mortality and the rate of maternal mortality, and six predictor variables: the percentage of poor people, the percentage of obstetric complications treated, the percentage of pregnant mothers who received Fe3, the percentage of first-time pregnant mothers under seventeen years of age, the percentage of use of health facilities and the percentage of households with a clean and healthy lifestyle.
The parameter estimators are obtained using the MLE. The simultaneous test for GWBGR's significance was done using MLRT. The partial test for individual parameter significance in GWBGR was done using the Wald test.

Parameter Estimation of GWBGR with Three-Parameters
The three-parameter GWBGR model can be formed as follows.
The GWBGR model parameters are estimated using maximum-likelihood estimation (MLE). The first step is to form the likelihood function for the set of parameters. The log-likelihood function is as follows.
The following is the log-likelihood function for each location.
The first derivatives of the log-likelihood function are taken with respect to θ, γ 1 , γ 2 , β 1 (u i * , v i * ) and β 2 (u i * , v i * ). The resulting equations are implicit (not in closed form), so numerical optimization with the BHHH algorithm is required. The BHHH procedure is similar to that used for parameter estimation in the global model; the iteration stops when $\lVert \lambda_{r+1} - \lambda_{r} \rVert < \varepsilon$. The BHHH iteration equation is as follows.
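The BHHH idea can be sketched in a few lines: each update moves the parameter vector by the inverse of the summed outer products of the per-observation scores, multiplied by the total score. The example below is a minimal sketch under strong simplifications, not the estimation procedure of this study: it fits an unweighted two-parameter gamma (shape and scale) to simulated data, approximates the scores by finite differences and adds step halving as a safeguard. All function names and the simulated data are hypothetical.

```python
import math
import random


def loglik_i(params, y):
    """Log-density of one observation under a two-parameter gamma (shape a, scale s)."""
    a, s = params
    return (a - 1.0) * math.log(y) - y / s - a * math.log(s) - math.lgamma(a)


def score_i(params, y, h=1e-6):
    """Score vector of one observation, approximated by central differences."""
    grad = []
    for k in range(len(params)):
        up, down = list(params), list(params)
        up[k] += h
        down[k] -= h
        grad.append((loglik_i(up, y) - loglik_i(down, y)) / (2.0 * h))
    return grad


def total_loglik(params, data):
    return sum(loglik_i(params, y) for y in data)


def bhhh(data, start, tol=1e-8, max_iter=500):
    """BHHH iterations for two parameters, with step halving (an added safeguard)."""
    params = list(start)
    for _ in range(max_iter):
        scores = [score_i(params, y) for y in data]
        g0 = sum(s[0] for s in scores)
        g1 = sum(s[1] for s in scores)
        # Sum of outer products of the scores approximates the information matrix.
        b00 = sum(s[0] * s[0] for s in scores)
        b01 = sum(s[0] * s[1] for s in scores)
        b11 = sum(s[1] * s[1] for s in scores)
        det = b00 * b11 - b01 * b01
        # Direction = B^{-1} g, written out for the 2 x 2 case.
        step = [(b11 * g0 - b01 * g1) / det, (b00 * g1 - b01 * g0) / det]
        current = total_loglik(params, data)
        new, t = params, 1.0
        while t > 1e-10:
            trial = [params[0] + t * step[0], params[1] + t * step[1]]
            if min(trial) > 1e-4 and total_loglik(trial, data) > current:
                new = trial
                break
            t /= 2.0
        # Stop when the parameter change is below the tolerance.
        if max(abs(new[0] - params[0]), abs(new[1] - params[1])) < tol:
            return new
        params = new
    return params


# Simulated data (hypothetical): shape 3, scale 2.
random.seed(1)
data = [random.gammavariate(3.0, 2.0) for _ in range(300)]
mean_y = sum(data) / len(data)
var_y = sum((y - mean_y) ** 2 for y in data) / (len(data) - 1)
start = [mean_y ** 2 / var_y, var_y / mean_y]  # method-of-moments start
estimate = bhhh(data, start)
```

Because BHHH needs only first derivatives, it avoids computing the Hessian of the log-likelihood. In a geographically weighted setting, the contribution of each observation would additionally carry its kernel weight, which this sketch omits.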

The Similarity Model Test
Once good parameter estimates have been obtained for the global and local models, the next step is to test the similarity of the two models by comparing the deviance of the global model with the deviance of the local model, taking their degrees of freedom into account. The hypothesis is as follows.

Hypothesis 6 (H6). at least one
The null hypothesis is rejected if $F > F_{(\alpha, df_1, df_2)}$, where $df_1$ and $df_2$ are the degrees of freedom of the global model and the local model, respectively, and α is the significance level. The deviance for the GWBGR model is obtained from ln L(ω GWBGR ) and ln L(Ω GWBGR ). ln L(ω GWBGR ) is obtained by maximizing the likelihood function under the null hypothesis. For the set of parameters under the null hypothesis, the log-likelihood function can be defined as follows.
The parameter estimators at location i * under the null hypothesis are obtained using MLE.
The first derivatives of the log-likelihood function are taken with respect to θ, γ 1 , γ 2 , β 10 (u i * , v i * ) and β 20 (u i * , v i * ). These first derivatives are not in closed form, so a numerical method is needed to obtain the estimate of each parameter. Once the parameter estimates have been obtained, the next step is to define ln L(ω GWBGR ).

The Simultaneous Test
A simultaneous test is used to determine the joint significance of the regression coefficients, with test statistic G 2 . The hypothesis used for simultaneous testing is as follows.
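The test statistic used is G 2 , and the null hypothesis is rejected if $G^2 > \chi^2_{(\alpha;v)}$, where α is the significance level and v is the difference in the number of parameters, n(Ω) − n(ω).
For the partial test of individual parameters, the test statistic used is the Wald statistic Z.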
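For orientation, the usual generic forms of these two statistics are sketched below. They are the textbook likelihood-ratio and Wald expressions, stated under the assumption that the study follows them; the subscripts on β are hypothetical placeholders for the response and predictor indices.

```latex
G^2 = -2 \ln\!\left(\frac{L(\hat{\omega})}{L(\hat{\Omega})}\right)
    = 2\left[\ln L(\hat{\Omega}) - \ln L(\hat{\omega})\right],
\qquad
Z = \frac{\hat{\beta}_{jl}}{\widehat{\mathrm{se}}\!\left(\hat{\beta}_{jl}\right)} .
```

Here the standard error in the Wald statistic is typically taken from the inverse of the estimated information matrix, and the partial null hypothesis is rejected when |Z| exceeds the standard normal critical value.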

Descriptive Statistics and Testing Assumptions
Table 1 presents descriptive statistics for the response and predictor variables for the 34 regencies and cities in North Sulawesi, Gorontalo and Central Sulawesi. The average RIM and RMM in 2016 were 10.463 and 209.6, respectively. The lowest percentage of poor people was in the city of Manado, at 5.24%, and the highest was in Gorontalo, at 78.35%. The average percentage of obstetric complications in the three provinces was 15.95%. The average provision of Fe3 to pregnant mothers was 76.55%. The percentage of first-time pregnant mothers under seventeen years of age was highest in Parigi Moutong regency, Central Sulawesi, and lowest in the city of Manado. The average use of health facilities in the three provinces was 65.52%. The regency with the lowest percentage of clean and healthy living behavior was Talaud regency, at 27.12%. The Kolmogorov-Smirnov test statistics for Y1 and Y2 are 0.0985 (p-value: 0.864) and 0.0701 (p-value: 0.9919), respectively. Because both p-values are greater than the 5% significance level, it can be concluded that Y1 and Y2 follow the three-parameter gamma distribution.
Before modeling, it is necessary to examine the correlation between the response variables, because modeling with more than one response variable requires the responses to be correlated. The test gives t = 3.2213; since |t| = 3.2213 > t (0.025; 32) = 2.0369, the null hypothesis is rejected, which means that there is a relationship between the response variables Y1 and Y2.
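The correlation test can be illustrated with a short sketch. It assumes the usual t statistic for a Pearson correlation, t = r·sqrt(n − 2)/sqrt(1 − r²), which has n − 2 degrees of freedom and is consistent with the 32 degrees of freedom reported for 34 observations. The function name and the toy data are hypothetical and are not the RIM and RMM values.

```python
import math


def correlation_t(x, y):
    """Pearson correlation r and its t statistic with n - 2 degrees of freedom."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    r = sxy / math.sqrt(sxx * syy)
    t = r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)
    return r, t


# Toy data for illustration only.
r, t = correlation_t([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])

# Decision rule using the values reported in the text.
reject_null = abs(3.2213) > 2.0369
```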
Another requirement before modeling RIM and RMM is the absence of multicollinearity among the predictor variables. A VIF value greater than 10 indicates multicollinearity. Table 2 shows that all VIF values are smaller than 2, so it can be concluded that there is no multicollinearity.

Modeling RIM and RMM with BGR Three-Parameters
RIM and RMM were modeled with BGR to identify the factors that affect RIM and RMM in the provinces of North Sulawesi, Gorontalo and Central Sulawesi. The BGR parameter estimates are shown in Tables 3 and 4. The parameters were first tested simultaneously using the test statistic G 2 . The simultaneous test statistic is 11,165.08; because this value is greater than χ 2 (0.05,12) = 3.94, the null hypothesis is rejected, which means that at least one predictor variable significantly influences the response variables. The partial test was then carried out. The values of the test statistic Z are shown in Tables 3 and 4. At the 5% significance level, all predictor variables significantly influence RIM and RMM because every |Z| is greater than Z 0.025 = 1.96. Based on Tables 3 and 4, the global models for RIM and RMM in the provinces of North Sulawesi, Gorontalo and Central Sulawesi, respectively, are as follows.

Modeling the RIM and RMM with GWBGR Three-Parameters Models
Modeling RIM and RMM with the GWBGR model begins by determining the geographic location of each regency and city in the provinces of North Sulawesi, Gorontalo and Central Sulawesi and then determining the optimum bandwidth using the GCV criterion. The next step is to calculate the Euclidean distances, from which the weighting matrix is obtained. The selected weighting function is the fixed bisquare kernel, with an optimum bandwidth of 33.265.
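The weighting step can be sketched as follows. The sketch assumes the commonly used fixed bisquare form, w = (1 − (d/g)²)² for d < g and 0 otherwise, which may differ in detail from the definition in [18]. The coordinates are hypothetical and their units are assumed to match those of the bandwidth; only the bandwidth 33.265 comes from the text.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two (u, v) coordinate pairs."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def bisquare_weight(distance, bandwidth):
    """Fixed bisquare kernel (assumed form): (1 - (d/g)^2)^2 inside g, 0 outside."""
    if distance >= bandwidth:
        return 0.0
    return (1.0 - (distance / bandwidth) ** 2) ** 2


def weight_matrix(coords, bandwidth):
    """Row i* holds the weights of all locations when estimating at location i*."""
    return [
        [bisquare_weight(euclidean(p, q), bandwidth) for q in coords]
        for p in coords
    ]


# Hypothetical coordinates of three locations, for illustration only.
coords = [(0.0, 0.0), (10.0, 0.0), (0.0, 40.0)]
W = weight_matrix(coords, bandwidth=33.265)
```

Each location receives full weight for itself, nearby locations receive weights between 0 and 1, and locations farther away than the bandwidth are excluded from the local fit.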
To test whether geographical factors have a significant effect, the similarity of the global and local models is tested with the F test under the following hypotheses.
With a deviance of 11,165.08 for the global model and 213,372.4 for the GWBGR model, the value of F is 1.7791. Because F is greater than F (0.05;12,408) = 1.776, the null hypothesis is rejected at the 5% significance level. Hence, it is concluded that there are significant differences between the three-parameter GWBGR and BGR models. The simultaneous hypothesis test then determines whether the predictor variables jointly affect RIM and RMM. The G 2 test statistic is 213,372.4116, which is greater than χ 2 (0.05,408) = 234.806, so the null hypothesis is rejected; that is, the predictor variables simultaneously affect the response variables at the 5% significance level. Partial testing is then needed to determine which predictor variables affect the response variables. Tables 5 and 6 show the parameter estimates and the test statistic Z of the three-parameter GWBGR model for RIM and RMM at one location, Bolaang Mongondow. The results indicate that each predictor variable has a significant effect on RIM and RMM, as every |Z| is greater than the Z table value at the 5% significance level. The three-parameter GWBGR model is therefore as follows.
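The reported F value can be reproduced from the two deviances if F is taken as the ratio of each deviance divided by its degrees of freedom. This form, and the degrees of freedom 12 and 408 read from the critical value, are inferences from the reported numbers rather than a formula stated in the text; the function name is hypothetical.

```python
def similarity_f(dev_global, df_global, dev_local, df_local):
    """Ratio of deviances, each scaled by its degrees of freedom (assumed form)."""
    return (dev_global / df_global) / (dev_local / df_local)


# Deviances as reported; degrees of freedom inferred from F(0.05; 12, 408).
f_stat = similarity_f(11165.08, 12, 213372.4, 408)

# Compare with the critical value reported in the text.
reject_null = f_stat > 1.776
```

With these inputs the ratio agrees with the reported 1.7791 to four decimal places, which supports reading the degrees of freedom as 12 and 408.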

Selection of Best Model
The AIC C value is a criterion that can be used to determine the best model among the global and local models of RIM and RMM in the provinces of North Sulawesi, Gorontalo and Central Sulawesi.
The best model is the one with the lowest AIC C value. Table 7 shows that the GWBGR model with a fixed bisquare kernel function is the best model for RIM and RMM in the provinces of North Sulawesi, Gorontalo and Central Sulawesi in 2016.
In 1987, concerns about the impact of high RMM prompted the WHO and other international organizations to establish the Safe Motherhood Initiative, which has six main pillars: family planning, antenatal care, maternity care, postnatal care, postabortion care and the control of sexually transmitted infections, HIV and AIDS.

Conclusions
Parameter estimation of the three-parameter BGR and GWBGR models using MLE produces equations that are not in closed form, so a numerical method is required to obtain the estimate of each parameter. In this study, the numerical method used is BHHH. BGR and GWBGR modeling is applied to RIM and RMM in the provinces of North Sulawesi, Gorontalo and Central Sulawesi. From the modeling results, RIM and RMM are influenced by the percentage of poor people, the percentage of obstetric complications treated, the percentage of pregnant mothers who received Fe3, the percentage of first-time pregnant mothers under seventeen years of age, the percentage of use of health facilities and the percentage of households with a clean and healthy lifestyle. GWBGR modeling with a fixed bisquare kernel weighting produces two groups of regencies and cities based on the variables that significantly affect RIM and two groups based on the variables that significantly affect RMM. The variables that significantly affect RIM and RMM are the percentage of poor people, the percentage of obstetric complications treated, the percentage of pregnant mothers who received Fe3, the percentage of first-time pregnant mothers under seventeen years of age and the percentage of use of health facilities. The percentage of households with a clean and healthy lifestyle does not significantly affect RIM in Bolaang Mongondow and Bone Bolango and does not significantly affect RMM in 17 regencies and cities. For future studies, it is suggested that researchers use the mixed GWBGR model, obtain the data for the response and predictor variables from the same source and compare other test statistics for detecting multicollinearity.