In this section, Monte Carlo simulations are carried out to demonstrate the finite-sample performance of the proposed model and estimation method. We also apply the method to a real dataset. To assess robustness and applicability, two kinds of matrices are chosen to investigate the influence of the spatial weight matrix
W on the estimation results. One is the Rook weight matrix as in [35], generated according to Rook contiguity, which allocates the n spatial units on a lattice of ( ) squares, finds the neighbours of each unit, and row-normalizes the result. The other is the Case weight matrix as in [59]: we consider a spatial scenario with r districts and m members in each district, where each neighbour of a member in a district is given equal weight.
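The two constructions described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes a square lattice (the lattice dimensions are not legible in the text), and the function names `rook_weights` and `case_weights` are hypothetical.

```python
import numpy as np

def rook_weights(side):
    """Row-normalized Rook contiguity matrix on a side x side lattice.

    Assumption: a square lattice with side >= 2, so n = side * side.
    """
    n = side * side
    W = np.zeros((n, n))
    for r in range(side):
        for c in range(side):
            i = r * side + c
            # Rook neighbours share an edge: up, down, left, right.
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < side and 0 <= cc < side:
                    W[i, rr * side + cc] = 1.0
    # Row normalization: each row sums to one.
    return W / W.sum(axis=1, keepdims=True)

def case_weights(r, m):
    """Case matrix: r districts of m members (m >= 2).

    Each member gives equal weight 1 / (m - 1) to the other members of
    its own district and zero weight to everyone else.
    """
    block = (np.ones((m, m)) - np.eye(m)) / (m - 1)
    return np.kron(np.eye(r), block)

W_rook = rook_weights(4)     # n = 16 units
W_case = case_weights(3, 5)  # n = 15 units
```

Both matrices have a zero diagonal and unit row sums; the Case matrix is block diagonal, so units in different districts never interact.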
4.1. Simulation
The samples are generated from the following model:
where the covariate vectors
follow a bivariate normal distribution with mean vector
and covariance matrix
and
are bivariate,
and
for
,
and the error term
, where
F is the common cumulative distribution function of
. By subtracting the
th quantile, we ensure that the error term has a zero
th quantile. The varying coefficient functions
with
and
. Furthermore, we choose three different values of the spatial parameter
at three different quantile points
, and use the Rook and the Case weight matrix in turn as the spatial weight matrix W. The sample sizes are
for the Rook weight matrix, and the numbers of districts and members are
for the Case weight matrix.
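The centring of the error term described above can be illustrated with a short sketch. The error distribution and the quantile levels are not legible in the text, so a standard normal base distribution and the level 0.25 are assumptions made purely for illustration; `centred_errors` is a hypothetical helper.

```python
import numpy as np
from statistics import NormalDist

def centred_errors(n, tau, rng):
    """Draw n errors whose tau-th quantile is zero.

    Assumption: the base errors are standard normal, so F^{-1}(tau) is
    the standard normal quantile. Subtracting it shifts the tau-th
    quantile of the errors to zero.
    """
    eps = rng.standard_normal(n)
    return eps - NormalDist().inv_cdf(tau)

rng = np.random.default_rng(0)
tau = 0.25                           # hypothetical quantile level
e = centred_errors(100_000, tau, rng)
share_below_zero = np.mean(e <= 0.0)  # empirical P(e <= 0), close to tau
```

After centring, roughly a fraction tau of the errors lies below zero, which is what makes the regression function the tau-th conditional quantile of the response.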
We conduct each simulation with 1000 replications. For , we use quadratic P-splines in which the knots are placed at equally spaced intervals of the predictor variable, and set the design hyper-parameters in our computation. Second-order random walk penalties are used for the Bayesian P-splines that approximate the unknown smooth functions. The unknown parameters are drawn from their respective prior distributions. The tuning parameter is adjusted incrementally, up or down, so that the acceptance rate of the corresponding parameter stays around 25%.
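The second-order random walk penalty mentioned above corresponds to a quadratic form in the second differences of the spline coefficients. The sketch below builds that penalty matrix under the usual convention P = D'D, where D is the second-difference operator; the number of coefficients (8) is an arbitrary illustrative choice, not a value from the paper.

```python
import numpy as np

def rw2_penalty(num_coef):
    """Second-order random walk penalty matrix P = D'D.

    D is the (num_coef - 2) x num_coef second-difference matrix, so
    gamma' P gamma equals the sum of squared second differences of gamma.
    """
    D = np.diff(np.eye(num_coef), n=2, axis=0)
    return D.T @ D

P = rw2_penalty(8)
gamma = np.arange(8.0) ** 2    # coefficients 0, 1, 4, 9, ...
roughness = gamma @ P @ gamma  # sum of squared second differences
```

Constant and linear coefficient sequences lie in the null space of P, so the penalty shrinks the fitted function towards a straight line rather than towards zero.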
For each replication, we generate 6000 draws from the proposed Gibbs sampler and discard the first 3000 as a burn-in period so that the Markov chains reach their steady state. From the last 3000 draws, we compute the posterior mean (Mean), the standard error (SE), and the 2.5th and 97.5th percentiles of the parameters, which give the 95% posterior credible intervals (95% CI), that is, intervals containing the parameters with posterior probability 95%. We then report the averages of these quantities across the 1000 replications.
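The per-replication summaries can be sketched as below. The draws here are simulated stand-ins for Gibbs output (a normal sample centred at a hypothetical value 0.5), and `posterior_summary` is an illustrative helper rather than the authors' implementation.

```python
import numpy as np

def posterior_summary(draws, burn_in):
    """Posterior mean, SE and 95% percentile interval after burn-in."""
    kept = np.asarray(draws, dtype=float)[burn_in:]
    lower, upper = np.percentile(kept, [2.5, 97.5])
    return {
        "mean": kept.mean(),
        "se": kept.std(ddof=1),  # posterior standard deviation of the draws
        "ci": (lower, upper),    # 2.5th and 97.5th percentiles
    }

rng = np.random.default_rng(1)
# Stand-in for 6000 sampler draws of one parameter (hypothetical values).
draws = 0.5 + 0.1 * rng.standard_normal(6000)
summary = posterior_summary(draws, burn_in=3000)
```

Averaging `summary["mean"]` and `summary["se"]` over replications gives the Mean and SE columns, while the standard deviation of the replication means gives the SD used for comparison.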
We also compute the standard deviations (SD) of the estimated posterior means and compare them with the means of the estimated posterior SE. For the model (
19), LeSage and Pace [
60] suggested scalar summary measures for the marginal effects, which are given by
for
. The direct effect is the average of the diagonal elements of the resulting matrix. The indirect effect is the average of either the row sums or the column sums of the off-diagonal elements, and the total effect is the sum of the direct and indirect effects.
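As a rough illustration of these summary measures, the sketch below uses the standard spatial autoregressive form of the LeSage and Pace effects matrix, S_k(W) = (I - rho W)^{-1} beta_k. The paper's own expression is not legible here, so this form, the ring-shaped weight matrix, and the values rho = 0.5 and beta_k = 1 are all assumptions.

```python
import numpy as np

def sar_effects(W, rho, beta_k):
    """Direct, indirect and total effects for one covariate.

    Assumes the effects matrix S = (I - rho * W)^{-1} * beta_k.
    """
    n = W.shape[0]
    S = np.linalg.inv(np.eye(n) - rho * W) * beta_k
    direct = np.trace(S) / n  # average of the diagonal elements
    total = S.sum() / n       # average of the row sums
    indirect = total - direct  # average off-diagonal row sum
    return direct, indirect, total

# Row-normalized weights for four units arranged on a ring.
W = np.array([
    [0.0, 0.5, 0.0, 0.5],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.5, 0.0, 0.5, 0.0],
])
direct, indirect, total = sar_effects(W, rho=0.5, beta_k=1.0)
```

With a row-normalized W the total effect reduces to beta_k / (1 - rho), which is a convenient check on the computation.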
To check the convergence of the MCMC algorithm, five Markov chains with different starting values are run through the Gibbs sampler in each replication.
Figure 1 displays trace plots for some of the unknown quantities, including model parameters and fitted functions at grid points. The five parallel sequences mix reasonably well. We further calculate the “potential scale reduction factor”
for all unknown parameters and for the varying coefficient functions at 10 selected grid points, based on the five parallel sequences.
Figure 2 shows the values of
after 3000 iterations. We observe that all the values of
are less than 1.2, the threshold suggested by Gelman and Rubin [61], after 3000 burn-in iterations, which indicates that this burn-in length is sufficient for convergence.
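The potential scale reduction factor can be computed from parallel chains as in the following sketch, which uses the classical Gelman and Rubin between-chain and within-chain variance formula. The chains are simulated stand-ins, not output from the proposed sampler.

```python
import numpy as np

def psrf(chains):
    """Potential scale reduction factor for an (m chains) x (n draws) array."""
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    chain_means = chains.mean(axis=1)
    B = n * chain_means.var(ddof=1)        # between-chain variance
    W = chains.var(axis=1, ddof=1).mean()  # average within-chain variance
    var_hat = (n - 1) / n * W + B / n      # pooled variance estimate
    return np.sqrt(var_hat / W)

rng = np.random.default_rng(2)
mixed = rng.standard_normal((5, 3000))   # five chains, same target
stuck = mixed + np.arange(5)[:, None]    # five chains stuck at different levels
r_mixed = psrf(mixed)
r_stuck = psrf(stuck)
```

Chains that explore the same distribution give a value close to 1, whereas chains centred at different levels give a value well above the 1.2 threshold.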
To investigate the finite-sample performance of the estimated varying coefficient functions, we use the mean absolute deviation error (MADE) and the global mean absolute deviation error (GMADE), defined as
at 100 fixed grid points
that are equally spaced over the interval
.
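A minimal sketch of these two measures is given below. Since the defining formulas are not legible here, it assumes the common convention that MADE averages the absolute error of one function over the grid and GMADE averages MADE over all functions. The interval [0, 1], the two "true" functions and the perturbed "estimates" are hypothetical placeholders.

```python
import numpy as np

def made(estimate, truth):
    """Mean absolute deviation error of one function over the grid."""
    return np.mean(np.abs(np.asarray(estimate) - np.asarray(truth)))

def gmade(estimates, truths):
    """Assumed definition: average of MADE over all coefficient functions."""
    return np.mean([made(e, t) for e, t in zip(estimates, truths)])

u = np.linspace(0.0, 1.0, 100)  # 100 equally spaced grid points (interval assumed)
truths = [np.sin(2 * np.pi * u), np.cos(2 * np.pi * u)]  # hypothetical functions
estimates = [truths[0] + 0.1, truths[1] - 0.3]           # stand-in estimates
made_values = [made(e, t) for e, t in zip(estimates, truths)]
gmade_value = gmade(estimates, truths)
```

In a simulation study, these values would be computed once per replication and then summarized with boxplots.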
Figure 3a displays the boxplots of the MADE and GMADE values with sample size
and
at
quantile. For the Rook weight matrix (left three panels), the medians are
,
and
. For the Case weight matrix (right three panels), the medians are
,
and
.
Figure 3b shows the boxplots of the MADE and GMADE values with sample size
and
at
quantile. For the Rook weight matrix, the medians are
,
and
in the left three boxplots. For the Case weight matrix, the medians are
,
and
in the right three boxplots. The MADE and GMADE values decrease as the sample size n increases, and they are smaller under the Case weight matrix than under the Rook weight matrix. In other words, the estimated varying coefficient functions become more accurate as the sample size grows, and more so with the Case weight matrix. This shows that the proposed model and estimation method give reasonable estimates and perform well in finite samples under both the Rook and the Case weight matrices.
Table 1 and
Table 2 summarize the estimation results. The parameter estimates differ considerably across the three quantiles of the response distribution. Under the same spatial weight matrix, the accuracy of the results improves as the sample size increases. The means of the estimates are close to the respective true values, and the average values of the SE are close to the corresponding SD, indicating that both the parameter estimates and the standard errors are accurate. For the parameter
under the same sample sizes, we find the SE and SD of parameter
with the Case weight matrix are slightly better than those with the Rook weight matrix. In addition, the general pattern from the estimates reported in
Table 1 and
Table 2 is that all estimators show relatively larger bias in the total effect estimates when there is strong positive spatial dependence, for similar sample sizes. When we repeat the aforementioned experiments with different starting values, the estimation results are similar, which indicates that the proposed Gibbs sampler performs well.
Figure 4 compares the estimated varying coefficient functions at different quantiles, along with the 95% pointwise posterior credible intervals of
and
from a typical sample under
and
, respectively. The typical sample is selected so that its MADE value equals the median over the 1000 replications. The three fitted curves are fairly close to the solid curve, and the corresponding credible bands are narrow. As the sample size increases, the gaps between the fitted curves and the true curve shrink. There are also visible differences across quantiles of the response distribution. This illustrates that the varying coefficient function estimation procedure works well for small samples.
We compare the performance of the Bayesian quantile regression (BQR) estimator in this paper with that of the instrumental variable quantile regression (IVQR) estimator in Dai et al. [
12] with two examples.
Example 1. The model is given as follows: where , , and , , F is the common cumulative distribution function of , and the th quantile of the random error is centred to zero. and are generated from and , are bivariate. and are generated independently from and . Table 3 summarizes the comparison results of the QR, IVQR and BQR estimators with a homoscedastic error term.
Example 2. The model is given as follows: where , , and , , F is the common cumulative distribution function of , and the th quantile of the random error is equal to zero. and are generated from , and , are bivariate. and are generated independently from and . Table 4 summarizes the comparison results of the QR, IVQR and BQR estimators with a heteroscedastic error term.
The spatial weight matrix
is generated according to the mechanism
for
, and then W is row-standardized so that each row sums to one [12]. After repeating the estimation procedure 1000 times for each case, we calculate the bias and RMSE of the parameter estimates relative to the true values, and the MADE of the estimated varying coefficient functions.
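The bias and RMSE over replications follow the usual definitions, sketched below. The four estimates and the true value 0.5 are made-up numbers for illustration only.

```python
import numpy as np

def bias_rmse(estimates, true_value):
    """Bias and root mean squared error of estimates across replications."""
    err = np.asarray(estimates, dtype=float) - true_value
    bias = err.mean()
    rmse = np.sqrt(np.mean(err ** 2))
    return bias, rmse

# Hypothetical estimates of one parameter from four replications.
bias, rmse = bias_rmse([0.48, 0.52, 0.55, 0.45], true_value=0.5)
```

Bias captures systematic error, while RMSE also reflects the spread of the estimates, so an estimator can have zero bias and still a nonzero RMSE, as in this toy example.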
Table 3 and
Table 4 report the results of QR, IVQR and BQR for Example 1 and Example 2. The influence of the explanatory variables on the response differs considerably across quantiles of the response distribution. As the sample size increases, the bias, RMSE and MADE of all estimators decrease markedly. Among the three methods, the BQR estimator gives more robust results under the same conditions, with smaller bias, RMSE and MADE. These results suggest that the BQR algorithm outperforms QR and IVQR, although the latter two also give reasonable estimates.
4.2. Application
As an application of the proposed model and methods to real data, we use the well-known Sydney real estate data, described in detail in [62]. The data set contains 37,676 properties sold in the Sydney Statistical Division (an official geographical region including Sydney) in the calendar year 2001 and is available from the HRW package in R. To avoid temporal effects, we focus only on the last week of February, which includes 538 properties.
In this application, the house price (Price) is explained by four variables: the distance from the house to the nearest coastline location in kilometres (DC), the distance from the house to the nearest main road in kilometres (DR), the inflation rate measured as a percentage (IR) and the average weekly income (Income). DC and DR have linear effects on the response Price, while IR and Income have nonlinear effects. Moreover, we apply a logarithmic transformation to Price and DC to reduce the problems caused by large gaps in their ranges. In addition, Income is transformed so that its marginal distribution is approximately
. We therefore fit the following partially linear varying coefficient spatial autoregressive model:
where the response variable
,
,
,
,
. For the weight matrix, following Sun et al. [10], we use the Euclidean distance between any two houses to calculate the spatial weight matrix
W. The location of each house is given by its longitude and latitude, denoted as
. The spatial weight
is
For this dataset, we adopt quadratic P-splines and hyper-parameters for . The tuning parameter is chosen so that the acceptance rate for updating is around 25%.
We run the proposed Gibbs sampler five times with different starting values and generate 10,000 sampled values following a burn-in of 20,000 iterations in each run. Trace plots for some of the unknown quantities are shown in
Figure 5, and the five parallel sequences mix very well. Based on the five parallel sequences, we further calculate the “potential scale reduction factor”
, which is plotted in
Figure 6. It is clear that all the values of
are less than 1.2 after 20,000 burn-in iterations, so the proposed sampler converges well on the real data.
Table 5 lists the estimated parameters together with their standard errors and 95% posterior credible intervals. The estimate of the spatial coefficient
is 0.57 with standard error SE
at
quantile, which indicates positive and significant spatial spillover effects in Sydney housing prices. However, the spatial coefficient decreases as the quantile increases: when house prices are lower, the spatial effects and the interaction between different regions are stronger. The coefficients of the two covariates
and
are
and
at
quantile. They also have positive effects on housing prices at the other two quantiles;
and
play an increasingly important positive role as house prices rise, because both coefficients increase at higher quantiles.
Figure 7 presents the estimated varying coefficient functions together with their 95% pointwise posterior credible intervals at the three quantiles
and
shown by a dotted line, a star line and a forked line, respectively. The curves show an overall upward trend and rise more steeply as u becomes larger. This indicates that the effect of the covariate Income on the response is nonlinear and U-shaped. More specifically, at the quantile
, the varying coefficient function
is greater than at the other two quantiles, meaning that Income has a stronger positive influence in areas with higher housing prices. The empirical results support the robustness and practicality of the Bayesian P-splines method.