Article

Bayesian Analysis of Partially Linear Additive Spatial Autoregressive Models with Free-Knot Splines

School of Mathematics and Statistics, Fujian Normal University, Fuzhou 350117, China
*
Author to whom correspondence should be addressed.
Symmetry 2021, 13(9), 1635; https://doi.org/10.3390/sym13091635
Submission received: 5 July 2021 / Revised: 28 August 2021 / Accepted: 2 September 2021 / Published: 6 September 2021
(This article belongs to the Special Issue Probability, Statistics and Applied Mathematics)

Abstract:
This article deals with symmetrical data that can be modelled based on a Gaussian distribution. We consider a class of partially linear additive spatial autoregressive (PLASAR) models for spatial data and develop a Bayesian free-knot splines approach to approximate the nonparametric functions. Efficient Markov chain Monte Carlo (MCMC) tools are employed to design a Gibbs sampler that explores the full conditional posterior distributions of the PLASAR models. To obtain a rapidly convergent algorithm, a modified Bayesian free-knot splines approach incorporating powerful MCMC techniques is employed. The Bayesian estimator (BE) is more computationally efficient than the generalized method of moments estimator (GMME) and is thus capable of handling large-scale spatial data. The performance of the PLASAR model and methodology is illustrated by a simulation, and the model is used to analyze a Sydney real estate dataset.

1. Introduction

Spatial econometric models are frequently proposed to analyze spatial data arising in many disciplines, such as urban, real estate, public, agricultural, and environmental economics, and industrial organization. These models address relationships across geographic observations caused by spatial autocorrelation in cross-sectional or longitudinal data. Spatial econometric models have a long history in both econometrics and statistics. Early developments and relevant surveys can be found in Cliff and Ord [1], Anselin [2], Case [3], Cressie [4], LeSage [5,6], and Anselin and Bera [7].
Among spatial econometric models, the spatial autoregressive (SAR) model [1] has gained much attention from theoretical econometricians and applied researchers. Many approaches have been used to estimate SAR models, including the maximum likelihood estimator (MLE) [8], the generalized method of moments estimator (GMME) [9], and the quasi-maximum likelihood estimator (QMLE) [10]. However, these methods have mainly focused on parametric SAR models, which are frequently assumed to be linear; few researchers have explicitly examined non-/semi-parametric SAR models. Indeed, it has been confirmed that many economic variables exhibit highly nonlinear relationships with the dependent variables [11,12,13]. Neglecting the latent nonlinear functional forms often results in inconsistent estimation of the parameters and misleading conclusions [14].
Although many empirical studies and econometric analyses applying parametric SAR models ignore latent nonlinear relationships, several nonlinear forms [15,16,17,18] have been considered. Nevertheless, nonlinear parametric SAR models can at best offer protection against certain specific nonlinear functional forms; since the true nonlinear function is unknown, the risk of misspecifying it is unavoidable. As nonparametric techniques have advanced, nonparametric SAR models are often used to model nonlinear economic relationships. However, nonparametric components are only suitable for low-dimensional covariates; otherwise, the “curse of dimensionality” [19] is often encountered. Several nonparametric dimension-reduction tools have been considered to address this problem, for example, the single-index model [20], the partially linear model [21], the additive model [22], and the varying-coefficient model [23], among others. In recent years, many researchers have exploited the advantages of semiparametric modeling in spatial econometrics. For example, Su and Jin [14] proposed the QMLE for semiparametric partially linear SAR models; Su [24] discussed the GMME of semiparametric SAR models; Chen et al. [25] studied a Bayesian method for semiparametric SAR models; Wei and Sun [26] considered the GMME for the space-varying coefficients of a spatial model; Krisztin [27] investigated a novel Bayesian semiparametric estimation for penalized-spline SAR models; Krisztin [28] presented a genetic algorithm for a nonlinear SAR model; Du et al. [29] established the GMME of PLASAR models; and Chen and Cheng [30] developed a GMME of a partially linear additive spatial error model.
Semiparametric models have received much attention in both econometrics and statistics owing to the interpretability of their parametric parts and the flexibility of their nonparametric parts. The partially linear additive (PLA) model is probably the most popular among the various semiparametric models, as it not only avoids the “curse of dimensionality” encountered in nonparametric regression but also provides a more flexible structure than generalized linear models. As a result, PLA models strike a good balance between the flexibility of the additive model and the interpretability of the partially linear model. Many approaches have been considered to analyze such models: the local linear method [31], spline estimation [32,33,34], quantile regression [35,36,37,38], variable selection [39,40,41,42], etc. Characterizing the flexibility of nonparametric forms and attempting to explain the potential nonlinearity of PLASAR models are unique challenges faced by analysts of spatial data.
Combining PLA models with SAR models, we consider in this article a class of PLASAR models for spatial data to capture the linear and nonlinear effects of the related variables in addition to the spatial dependence between neighbors. We specify priors for all unknown parameters, which leads to a proper posterior distribution, and obtain posterior summaries via MCMC tools. We develop an improved Bayesian method with free-knot splines [43,44,45,46,47,48,49,50,51], along with MCMC techniques, to estimate the unknown parameters: a spline approach approximates the nonparametric functions, and a Gibbs sampler explores the joint posterior distribution. An attractive feature of Bayesian free-knot splines is that treating the number and positions of knots as random variables makes the model spatially adaptive [45,46]. To improve the convergence of our algorithm, we further modify the movement step of the Bayesian free-knot splines so that all knots can be repositioned in each iteration instead of moving only one knot. Finally, the performance of the PLASAR model and methodology is illustrated by a simulation, and they are used to analyze real data.
The rest of this paper is organized as follows. In Section 2, we propose the PLASAR model for spatial data, discuss the proposed model’s identifiability condition, and obtain the likelihood function by fitting the nonparametric functions with a Bayesian free-knot splines approach. To provide a Bayesian framework, we specify the priors for the unknown parameters, derive the full conditional posterior distributions, modify the movement step of the Bayesian free-knot splines approach to accelerate the convergence of our algorithm, and describe the detailed sampling scheme in Section 3. The applicability and practicality of the PLASAR model and methodology for spatial data are evaluated by a simulation study, and the model is used to analyze a real dataset in Section 4. Section 5 concludes the paper with a summary.

2. Methodology

2.1. Model

We begin with the PLASAR model that is defined as
$$y_i = \rho \sum_{l=1}^{n} w_{il} y_l + x_i^T \alpha + \sum_{j=1}^{p} g_j(z_{ij}) + \varepsilon_i, \quad i = 1, \ldots, n, \tag{1}$$
where $x_i = (x_{i1}, \ldots, x_{iq})^T$ and $z_i = (z_{i1}, \ldots, z_{ip})^T$ are covariate vectors, $y_i$ is a response variable, $w_{il}$ is a specified constant spatial weight, $g_j(\cdot)$ is an unknown univariate nonparametric function for $j = 1, \ldots, p$, $\alpha = (\alpha_1, \ldots, \alpha_q)^T$ is a $q \times 1$ vector of unknown parameters, the unknown spatial parameter $\rho$ reflects the spatial autocorrelation between neighbors, with stability condition $|\rho| < 1$, and the $\varepsilon_i$ are mutually independent and identically distributed as normal with zero mean and variance $\sigma^2$. To ensure identifiability of the nonparametric functions, it is often assumed that $E[g_j(z_j)] = 0$ for $j = 1, \ldots, p$.
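As an illustration, a dataset obeying model (1) can be generated through the reduced form $y = (I_n - \rho W)^{-1}\big(x\alpha + \sum_j g_j(z_j) + \varepsilon\big)$. The following sketch is ours, not the authors' code; the covariate distributions and function names are illustrative assumptions:

```python
import numpy as np

def simulate_plasar(n, rho, alpha, g_funcs, sigma2, W, rng):
    """Simulate one dataset from the PLASAR model (1) via its reduced form
    y = (I_n - rho*W)^{-1} (X alpha + sum_j g_j(z_j) + eps)."""
    q, p = len(alpha), len(g_funcs)
    X = rng.standard_normal((n, q))          # linear-part covariates (illustrative)
    Z = rng.uniform(-1.0, 1.0, size=(n, p))  # nonparametric-part covariates (illustrative)
    # empirical centring of each g_j: the sample analogue of E[g_j(z_j)] = 0
    G = np.column_stack([g(Z[:, j]) for j, g in enumerate(g_funcs)])
    G -= G.mean(axis=0)
    eps = rng.normal(0.0, np.sqrt(sigma2), size=n)
    A = np.eye(n) - rho * W                  # A(rho) = I_n - rho*W
    y = np.linalg.solve(A, X @ alpha + G.sum(axis=1) + eps)
    return y, X, Z
```

For instance, `simulate_plasar(100, 0.5, np.array([1.0, 1.0]), [lambda z: np.sin(np.pi * z)], 0.25, W, rng)` returns one sample of size 100 for a given row-normalized weight matrix `W`.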

2.2. Likelihood

We approximate the unknown functions $g_j(\cdot)$ in (1) by free-knot splines for $j = 1, \ldots, p$, assuming that $g_j(\cdot)$ admits a polynomial spline representation of degree $m_j$ with $k_j$ ordered interior knots $\xi_j = (\xi_{j1}, \ldots, \xi_{jk_j})^T$, where $a_j < \xi_{j1} < \cdots < \xi_{jk_j} < b_j$, i.e.,
$$g_j(u_j) = \sum_{l=1}^{K_j} B_{jl}(u_j) \beta_{jl} = B_j^T(u_j) \beta_j, \quad u_j \in [a_j, b_j], \tag{2}$$
where $K_j = 1 + m_j + k_j$, the vector of spline basis functions $B_j(u_j) = (B_{j1}(u_j), \ldots, B_{jK_j}(u_j))^T$ is determined by the knot vector $\xi_j$, the spline coefficient vector $\beta_j = (\beta_{j1}, \ldots, \beta_{jK_j})^T$ is $K_j \times 1$, and
$$a_j = \min_{1 \le i \le n} \{z_{ij}\} \quad \text{and} \quad b_j = \max_{1 \le i \le n} \{z_{ij}\} \tag{3}$$
are the boundary knots for $j = 1, \ldots, p$. Let $B_j = (B_j(z_{1j}), \ldots, B_j(z_{nj}))^T$ and $1 = (1, \ldots, 1)^T$. To achieve identification, we set $\sum_{i=1}^{n} \sum_{l=1}^{K_j} B_{jl}(z_{ij}) \beta_{jl} = 0$, which can be written as $1^T B_j \beta_j = 0$. Denoting $Q_j = 1^T B_j$, the constraint becomes $Q_j \beta_j = 0$.
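A minimal sketch of the design matrix $B_j$ and the constraint vector $Q_j = 1^T B_j$, using SciPy's `BSpline`; the helper name and the default degree are our illustrative choices:

```python
import numpy as np
from scipy.interpolate import BSpline

def bspline_design(z, interior_knots, degree=2):
    """Design matrix B_j (n x K_j), K_j = 1 + degree + k_j, for a spline of the
    given degree with boundary knots at min(z) and max(z), as in (2)-(3)."""
    a, b = z.min(), z.max()
    # clamped knot vector: boundary knots repeated degree+1 times
    t = np.r_[np.repeat(a, degree + 1), np.sort(interior_knots),
              np.repeat(b, degree + 1)]
    K = len(t) - degree - 1
    eye = np.eye(K)
    # l-th column = l-th B-spline basis function evaluated at the data
    return np.column_stack([BSpline(t, eye[l], degree)(z) for l in range(K)])
```

The identification constraint of this subsection then uses `Q = np.ones(len(z)) @ B`, the row vector $Q_j = 1^T B_j$.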
It follows that model (1) can be equivalently written as
$$y_i = \rho \sum_{l=1}^{n} w_{il} y_l + x_i^T \alpha + \sum_{j=1}^{p} B_j^T(u_j) \beta_j + \varepsilon_i = \rho \sum_{l=1}^{n} w_{il} y_l + x_i^T \alpha + B_i^T(u) \beta + \varepsilon_i, \quad i = 1, \ldots, n,$$
where $\beta = (\beta_1^T, \ldots, \beta_p^T)^T$ and $B_i(u) = (B_1^T(u_1), \ldots, B_p^T(u_p))^T$. Then the matrix form of model (1) can be represented as
$$y = \rho W y + x\alpha + B^T(u)\beta + \varepsilon,$$
where $x = (x_1, \ldots, x_n)^T$, $y = (y_1, \ldots, y_n)^T$, $\varepsilon = (\varepsilon_1, \ldots, \varepsilon_n)^T$, $W = (w_{il})$ is an $n \times n$ specified constant spatial weight matrix, $K = \sum_{j=1}^{p} K_j$, and $B^T(u)$ is an $n \times K$ matrix with $B_i^T(u)$ as its $i$th row.
The likelihood function for the PLASAR model is proportional to
$$
\begin{aligned}
L(\alpha, \beta, k, \xi, \sigma^2, \rho \mid y, x, z)
&\propto \sigma^{-n} |I_n - \rho W| \exp\Big\{-\frac{1}{2\sigma^2}\big[y - \rho W y - x\alpha - B^T(u)\beta\big]^T \big[y - \rho W y - x\alpha - B^T(u)\beta\big]\Big\} \\
&= \sigma^{-n} |A(\rho)| \exp\Big\{-\frac{1}{2\sigma^2}\big[A(\rho)y - x\alpha - B^T(u)\beta\big]^T \big[A(\rho)y - x\alpha - B^T(u)\beta\big]\Big\} \\
&= \sigma^{-n} |A(\rho)| \exp\Big\{-\frac{1}{2\sigma^2}\big[A(\rho)y - B(x, u)\theta\big]^T \big[A(\rho)y - B(x, u)\theta\big]\Big\},
\end{aligned} \tag{4}
$$
where $x = (x_1, \ldots, x_n)^T$, $z = (z_1, \ldots, z_n)^T$, $\theta = (\alpha^T, \beta^T)^T$ is a $(q+K) \times 1$ vector of regression coefficients, $B(x, u) = (x, B^T(u))$ is an $n \times (q+K)$ matrix, $A(\rho) = I_n - \rho W$, and $I_n$ is the identity matrix of order $n$.
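Under the definitions above, the log of the likelihood function of this subsection can be evaluated numerically as follows (a sketch; `B_xu` stands for the matrix $B(x,u)$ and `theta` for $\theta$, and the function name is ours):

```python
import numpy as np

def plasar_loglik(rho, theta, sigma2, y, B_xu, W):
    """Log-likelihood of the PLASAR model, up to an additive constant:
    log|A(rho)| - (n/2) log(sigma^2) - ||A(rho) y - B(x,u) theta||^2 / (2 sigma^2)."""
    n = len(y)
    A = np.eye(n) - rho * W
    _, logdet = np.linalg.slogdet(A)   # log|A(rho)|, numerically stable
    resid = A @ y - B_xu @ theta
    return logdet - 0.5 * n * np.log(sigma2) - resid @ resid / (2.0 * sigma2)
```

For fixed $\rho$ and $\sigma^2$, this is maximized in $\theta$ at the least-squares fit of $A(\rho)y$ on $B(x,u)$, which is the mode $\hat\theta$ appearing in the posterior derivations of Section 3.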

3. Bayesian Estimation

In this section, we consider a Bayesian free-knot splines approach with MCMC techniques to analyze the PLASAR model. We begin with the specification of the prior distributions of all unknown parameters, then derive their full conditional posterior distributions and describe the detailed sampling scheme. Meanwhile, we modify the movement step of the Bayesian free-knot splines approach so that all knots can be repositioned in each iteration.

3.1. Priors

As we consider a Bayesian approach with free-knot splines to analyze the PLASAR models, all unknown parameters are assigned prior distributions. Note that besides the regression coefficient vector $\theta$, the spatial autocorrelation coefficient $\rho$, and the variance $\sigma^2$, the numbers of knots $k = (k_1, \ldots, k_p)^T$ and the knot locations $\xi = (\xi_1^T, \ldots, \xi_p^T)^T$ also need prior distributions, since they are treated as random variables in the Bayesian free-knot splines approach. We avoid the use of improper prior distributions to prevent an improper joint posterior distribution.
For $j = 1, \ldots, p$, we follow Poon and Wang [49] by putting a Poisson prior with mean $\lambda_j$ on the number $k_j$ of knots,
$$\pi(k_j) = \frac{\lambda_j^{k_j}}{k_j!} e^{-\lambda_j},$$
and a conditional flat prior on the knot locations $\xi_j$,
$$\pi(\xi_j \mid k_j) = \frac{k_j!}{(b_j - a_j)^{k_j}} \Delta_j,$$
where $\Delta_j = I\{a_j = \xi_{j0} < \xi_{j1} < \cdots < \xi_{jk_j} < \xi_{j,k_j+1} = b_j\}$, and $a_j$ and $b_j$ are defined in (3).
We set a conjugate normal-inverse-gamma prior for the unknown parameters $(\theta, \sigma^2)$, composed of an inverse-gamma prior distribution for $\sigma^2$,
$$\pi(\sigma^2) \propto (\sigma^2)^{-\frac{r_0}{2}-1} \exp\Big\{-\frac{s_0^2}{2\sigma^2}\Big\},$$
where $r_0$ and $s_0^2$ are hyperparameters; a conditional normal prior distribution with mean vector $0$ and covariance matrix $\tau_0 \sigma^2 I_q$ for $\alpha$,
$$\pi(\alpha \mid \sigma^2, \tau_0) \propto (2\pi\tau_0\sigma^2)^{-\frac{q}{2}} \exp\Big\{-\frac{\alpha^T\alpha}{2\tau_0\sigma^2}\Big\};$$
and a conditional normal prior distribution with mean vector $0$ and covariance matrix $\tau_j \sigma^2 I_{K_j}$ for $\beta_j$ under the constraint $Q_j \beta_j = 0$:
$$\pi(\beta_j \mid k_j, \xi_j, \tau_j, \sigma^2) \propto (2\pi\tau_j\sigma^2)^{-\frac{K_j}{2}} \exp\Big\{-\frac{\beta_j^T\beta_j}{2\tau_j\sigma^2}\Big\} I\{Q_j\beta_j = 0\}$$
for $j = 1, \ldots, p$. In order to improve the robustness of our method, we choose inverse-gamma priors
$$\pi(\tau_0) \propto \tau_0^{-\frac{r_{\tau_{\alpha 0}}}{2}-1} \exp\Big\{-\frac{s_{\tau_{\alpha 0}}^2}{2\tau_0}\Big\} \quad \text{and} \quad \pi(\tau_j) \propto \tau_j^{-\frac{r_{\tau_{\beta_j 0}}}{2}-1} \exp\Big\{-\frac{s_{\tau_{\beta_j 0}}^2}{2\tau_j}\Big\}$$
for $j = 1, \ldots, p$, where the $r_{\tau 0}$ and $s_{\tau 0}^2$ are pre-specified hyperparameters. Throughout this article, we set $r_0 = s_0^2 = 1$ to obtain a Cauchy-type prior for $\sigma^2$, and assign $r_{\tau_{\alpha 0}} = r_{\tau_{\beta_j 0}} = 1$ and $s_{\tau_{\alpha 0}}^2 = s_{\tau_{\beta_j 0}}^2 = 0.005$ to acquire highly dispersed inverse-gamma priors on $\tau_j$ for $j = 0, 1, \ldots, p$.
In addition, we follow LeSage and Pace [52] by eliciting a uniform prior $U(\lambda_{\min}^{-1}, \lambda_{\max}^{-1})$ for the spatial autocorrelation coefficient $\rho$,
$$\pi(\rho) \propto 1,$$
where $\lambda_{\max}$ and $\lambda_{\min}$ are the maximum and minimum eigenvalues of the standardized spatial weight matrix $W$, respectively.
Therefore, the joint priors of all of the quantities are defined as
$$\pi(\rho, \alpha, \beta, k, \xi, \sigma^2, \tau) = \pi(\rho)\,\pi(\sigma^2)\,\pi(\tau_0)\,\pi(\alpha \mid \sigma^2, \tau_0) \prod_{j=1}^{p} \pi(\beta_j \mid k_j, \xi_j, \tau_j, \sigma^2)\,\pi(k_j)\,\pi(\xi_j \mid k_j)\,\pi(\tau_j), \tag{5}$$
where $\tau = (\tau_0, \tau_1, \ldots, \tau_p)$ is a vector of hyperparameters. For computational convenience, we treat the hyperparameter vector $\tau$ as an unknown parameter vector.

3.2. The Full Conditional Posterior Distributions of Unknown Quantities

Since the joint posterior distribution of all quantities is very complicated, it is difficult to generate samples from it directly. To solve this problem, we derive the full conditional posterior distributions of the unknown quantities, modify the movement step of the Bayesian free-knot splines to speed up convergence, and describe the detailed sampling method in our algorithm.
It follows from the likelihood function (4) and the joint priors (5) that the conditional posterior distribution of ρ given the remaining unknown parameters is proportional to
$$p(\rho \mid y, x, z, \alpha, \beta, k, \xi, \sigma^2, \tau) \propto |A(\rho)| \exp\Big\{-\frac{1}{2\sigma^2}\big[A(\rho)y - x\alpha - B^T(u)\beta\big]^T \big[A(\rho)y - x\alpha - B^T(u)\beta\big]\Big\}. \tag{6}$$
It is not easy to simulate directly from (6), which does not have the form of any standard density function. Therefore, we use the Metropolis–Hastings algorithm [53,54] to solve this difficulty: draw $\rho^*$ from a Cauchy distribution with location $\rho$ and scale $\sigma_\rho$ truncated on $(-1, 1)$, where $\sigma_\rho$ is treated as a tuning parameter, and accept the candidate value $\rho^*$ with probability
$$\min\left\{1,\; \frac{p(\rho^* \mid x, y, z, \alpha, \beta, k, \xi, \sigma^2, \tau)}{p(\rho \mid x, y, z, \alpha, \beta, k, \xi, \sigma^2, \tau)} \times C_\rho\right\},$$
where
$$C_\rho = \frac{\arctan[(1 - \rho)/\sigma_\rho] - \arctan[(-1 - \rho)/\sigma_\rho]}{\arctan[(1 - \rho^*)/\sigma_\rho] - \arctan[(-1 - \rho^*)/\sigma_\rho]}.$$
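This Metropolis–Hastings step might be sketched as follows; the function names are our own, and `log_post` stands for any routine evaluating the log of (6) up to a constant. The factor $C_\rho$ is the ratio of the proposal's normalizing constants, which depend on the current location:

```python
import numpy as np

def draw_truncated_cauchy(loc, scale, lo, hi, rng):
    """Inverse-CDF draw from a Cauchy(loc, scale) truncated to (lo, hi)."""
    u_lo = np.arctan((lo - loc) / scale)
    u_hi = np.arctan((hi - loc) / scale)
    return loc + scale * np.tan(rng.uniform(u_lo, u_hi))

def mh_update_rho(rho, log_post, sigma_rho, rng):
    """One Metropolis-Hastings update of rho on (-1, 1); the ratio num/den is
    the correction C_rho for the rho-dependent truncation of the proposal."""
    rho_star = draw_truncated_cauchy(rho, sigma_rho, -1.0, 1.0, rng)
    num = np.arctan((1 - rho) / sigma_rho) - np.arctan((-1 - rho) / sigma_rho)
    den = np.arctan((1 - rho_star) / sigma_rho) - np.arctan((-1 - rho_star) / sigma_rho)
    log_ratio = log_post(rho_star) - log_post(rho) + np.log(num / den)
    if np.log(rng.uniform()) < min(0.0, log_ratio):
        return rho_star
    return rho
```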
From likelihood function (4) and priors (5), we can see that given ( ρ , τ ) , the joint posterior of ( α , β , k , ξ , σ 2 ) is given by
$$
\begin{aligned}
p(\alpha, \beta, k, \xi, \sigma^2 \mid x, y, z, \rho, \tau)
\;\propto\;& \sigma^{-n} \exp\Big\{-\frac{1}{2\sigma^2}\big[A(\rho)y - x\alpha - B^T(u)\beta\big]^T \big[A(\rho)y - x\alpha - B^T(u)\beta\big]\Big\} \\
&\times \sigma^{-r_0-q-2} \exp\Big\{-\frac{s_0^2}{2\sigma^2} - \frac{\alpha^T\alpha}{2\tau_0\sigma^2}\Big\} \times \prod_{j=1}^{p} \Big(\frac{\lambda_j}{b_j - a_j}\Big)^{k_j} \Delta_j \\
&\times \prod_{j=1}^{p} (2\pi\tau_j\sigma^2)^{-\frac{K_j}{2}} \exp\Big\{-\frac{\beta_j^T\beta_j}{2\tau_j\sigma^2}\Big\} I\{Q_j\beta_j = 0\} \\
\;\propto\;& \sigma^{-n-r_0-K-q-2} \exp\Big\{-\frac{1}{2\sigma^2}\big[A(\rho)y - B(x,u)\theta\big]^T \big[A(\rho)y - B(x,u)\theta\big] - \frac{s_0^2}{2\sigma^2} - \frac{\theta^T \operatorname{diag}\{\tau^{-1}\} \theta}{2\sigma^2}\Big\} I\{Q\beta = 0\} \\
&\times \prod_{j=1}^{p} \Big(\frac{\lambda_j \tau_j^{-\frac{1}{2}}}{b_j - a_j}\Big)^{k_j} \Delta_j \\
\;\propto\;& |\Xi|^{-\frac{1}{2}} \big(S^2 + s_0^2\big)^{-\frac{n+r_0}{2}} \prod_{j=1}^{p} \Big(\frac{\lambda_j \tau_j^{-\frac{1}{2}}}{b_j - a_j}\Big)^{k_j} \Delta_j \\
&\times \big(S^2 + s_0^2\big)^{\frac{n+r_0}{2}} (\sigma^2)^{-\frac{n+r_0}{2}-1} \exp\Big\{-\frac{S^2 + s_0^2}{2\sigma^2}\Big\} \\
&\times (2\pi\sigma^2)^{-\frac{K+q}{2}} |\Xi|^{\frac{1}{2}} \exp\Big\{-\frac{1}{2\sigma^2}(\theta - \hat\theta)^T \Xi (\theta - \hat\theta)\Big\} I\{Q\beta = 0\},
\end{aligned} \tag{7}
$$
where $\theta = (\alpha^T, \beta^T)^T$, $\operatorname{diag}\{\tau^{-1}\} = \operatorname{diag}\{\tau_0^{-1} I_q, \tau_1^{-1} I_{K_1}, \ldots, \tau_p^{-1} I_{K_p}\}$, $Q = (Q_1, \ldots, Q_p)$, $\Xi = \operatorname{diag}\{\tau^{-1}\} + B^T(x, u) B(x, u)$, $\hat\theta = \Xi^{-1} B^T(x, u) A(\rho) y$, and $S^2 = y^T A(\rho)^T A(\rho) y - \hat\theta^T \Xi \hat\theta$, which gives rise to the marginal posterior distribution
$$p(k, \xi \mid y, x, z, \rho, \tau) \propto |\Xi|^{-\frac{1}{2}} \big(S^2 + s_0^2\big)^{-\frac{n+r_0}{2}} \prod_{j=1}^{p} \Big(\frac{\lambda_j \tau_j^{-\frac{1}{2}}}{b_j - a_j}\Big)^{k_j} \Delta_j. \tag{8}$$
It is easy to see from (8) that
$$p(k_j, \xi_j \mid y, x, z, \rho, \alpha, k_{-j}, \xi_{-j}, \beta_{-j}, \tau) \propto |\Xi_j|^{-\frac{1}{2}} \big(S_j^2 + s_0^2\big)^{-\frac{n+r_0}{2}} \Big(\frac{\lambda_j \tau_j^{-\frac{1}{2}}}{b_j - a_j}\Big)^{k_j} \Delta_j, \quad j = 1, \ldots, p, \tag{9}$$
where $\Xi_j = \tau_j^{-1} I_{K_j} + B_j(u_j) B_j^T(u_j)$, $\tilde y_j = A(\rho) y - x\alpha - \sum_{l \ne j} B_l^T(u_l) \beta_l$, $\hat\beta_j = \Xi_j^{-1} B_j(u_j) \tilde y_j$, $S_j^2 = \tilde y_j^T \tilde y_j - \hat\beta_j^T \Xi_j \hat\beta_j$, and $k_{-j}$, $\xi_{-j}$, $\beta_{-j}$ are $k$, $\xi$, $\beta$ with $k_j$, $\xi_j$, $\beta_j$ excluded, respectively.
It follows from (7) that the method of composition [55] can be used to generate $\sigma^2$ from the conditional inverse-gamma posterior
$$p(\sigma^2 \mid y, x, z, \rho, \alpha, k, \xi, \beta, \tau) \propto \big(S^2 + s_0^2\big)^{\frac{n+r_0}{2}} (\sigma^2)^{-\frac{n+r_0}{2}-1} \exp\Big\{-\frac{S^2 + s_0^2}{2\sigma^2}\Big\} \tag{10}$$
and to sample θ from a conditional normal posterior
$$p(\theta \mid y, x, z, \rho, k, \xi, \sigma^2, \tau) \propto (2\pi\sigma^2)^{-\frac{K+q}{2}} |\Xi|^{\frac{1}{2}} \exp\Big\{-\frac{1}{2\sigma^2}(\theta - \hat\theta)^T \Xi (\theta - \hat\theta)\Big\} I\{Q\beta = 0\}. \tag{11}$$
It follows from (11) that
$$p(\alpha \mid y, x, z, \rho, k, \xi, \beta, \sigma^2, \tau) \propto (2\pi\sigma^2)^{-\frac{q}{2}} |\Xi_0|^{\frac{1}{2}} \exp\Big\{-\frac{1}{2\sigma^2}(\alpha - \hat\alpha)^T \Xi_0 (\alpha - \hat\alpha)\Big\}, \tag{12}$$
where $\Xi_0 = \tau_0^{-1} I_q + x^T x$, $\tilde y_0 = A(\rho) y - B^T(u)\beta$, $\hat\alpha = \Xi_0^{-1} x^T \tilde y_0$, and
$$p(\beta_j \mid y, x, z, \rho, \alpha, k, \xi, \beta_{-j}, \sigma^2, \tau) \propto (2\pi\sigma^2)^{-\frac{K_j}{2}} |\Xi_j|^{\frac{1}{2}} \exp\Big\{-\frac{1}{2\sigma^2}(\beta_j - \hat\beta_j)^T \Xi_j (\beta_j - \hat\beta_j)\Big\} I\{Q_j\beta_j = 0\}, \tag{13}$$
for $j = 1, \ldots, p$. To achieve identification, the constraint $Q_j \beta_j = 0$ must be imposed on $\beta_j$. According to Panagiotelis and Smith [56], drawing $\beta_j$ from (13) is equivalent to first drawing $\beta_j^*$ from the unconstrained normal distribution with mean vector $\hat\beta_j$ and precision matrix $\sigma^{-2}\Xi_j$, and then transforming $\beta_j^*$ to $\beta_j$ by
$$\beta_j = \beta_j^* - \Xi_j^{-1} Q_j^T \big(Q_j \Xi_j^{-1} Q_j^T\big)^{-1} Q_j \beta_j^*. \tag{14}$$
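The constrained draw of $\beta_j$ can be sketched as below: sample the unconstrained Gaussian, then project onto the hyperplane $\{Q_j \beta_j = 0\}$. We take the projection metric to be the covariance $\sigma^2 \Xi_j^{-1}$, which is our reading of the transformation above; the function name is illustrative:

```python
import numpy as np

def constrained_beta_draw(beta_hat, Xi, Q, sigma2, rng):
    """Draw beta ~ N(beta_hat, sigma^2 Xi^{-1}) and project it onto
    {Q beta = 0} under the covariance metric, in the spirit of
    Panagiotelis and Smith's transformation."""
    Sigma = sigma2 * np.linalg.inv(Xi)       # covariance of the unconstrained draw
    beta = rng.multivariate_normal(beta_hat, Sigma)
    SQt = Sigma @ Q.T                        # Sigma Q^T
    # beta - Sigma Q^T (Q Sigma Q^T)^{-1} Q beta satisfies Q beta = 0 exactly
    return beta - SQt @ np.linalg.solve(Q @ SQt, Q @ beta)
```

Here `Q` is the $1 \times K_j$ row vector $Q_j$; the projected draw satisfies the constraint up to floating-point error.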
As it is convenient to sample $(\sigma^2, \alpha, \beta)$ from the conditional posteriors (10), (12), and (13), we concentrate on sampling from (9). The original Bayesian free-knot splines sampler [43,44,45,46,47,48,49,50,51] is a reversible-jump sampler [57] with three types of moves: the deletion, the addition, and the movement of a single knot [48]. We keep the first two move types unchanged but improve the movement step through the hit-and-run algorithm [58], so that all knots can be repositioned in each iteration instead of only one: for $j = 1, \ldots, p$, select a $k_j$-dimensional direction vector $c_j = (c_{j1}, \ldots, c_{jk_j})^T$ at random and define
$$\Omega_j = \big\{\omega_j : \xi_j^* = \xi_j + \omega_j c_j \ \text{with}\ a_j < \xi_{ji}^* < b_j,\ i = 1, \ldots, k_j\big\} = (\omega_{j1}, \omega_{j2});$$
generate $\omega_j$ from a Cauchy distribution with location 0 and scale $\sigma_{\xi_j}$ truncated on $(\omega_{j1}, \omega_{j2})$, where $\sigma_{\xi_j}$ acts as a tuning parameter; set $\xi_j^* = \xi_j + \omega_j c_j$ and reorder all the knots. The proposed number and positions of knots are finally accepted with probability
$$\min\left\{1,\; A_j \left(\frac{|\Xi_j|}{|\Xi_j^*|}\right)^{\frac{1}{2}} \times \left(\frac{S_j^2 + s_0^2}{S_j^{*2} + s_0^2}\right)^{\frac{n+r_0}{2}}\right\},$$
where $\Xi_j^*$ and $S_j^{*2}$ correspond to $\Xi_j$ and $S_j^2$ evaluated at the candidate knots, respectively, and the factor
$$A_j = \frac{\arctan(\omega_{j2}/\sigma_{\xi_j}) - \arctan(\omega_{j1}/\sigma_{\xi_j})}{\arctan[(\omega_{j2} - \omega_j)/\sigma_{\xi_j}] - \arctan[(\omega_{j1} - \omega_j)/\sigma_{\xi_j}]}.$$
It is evident that the posteriors of the hyperparameters are conditional inverse-gamma distributions,
$$p(\tau_0 \mid \sigma^2, \alpha) \propto \tau_0^{-\frac{q + r_{\tau_{\alpha 0}}}{2}-1} \exp\Big\{-\frac{s_{\tau_{\alpha 0}}^2 + \alpha^T\alpha/\sigma^2}{2\tau_0}\Big\} \tag{15}$$
and
$$p(\tau_j \mid \beta_j, k_j, \xi_j, \sigma^2) \propto \tau_j^{-\frac{K_j + r_{\tau_{\beta_j 0}}}{2}-1} \exp\Big\{-\frac{s_{\tau_{\beta_j 0}}^2 + \beta_j^T\beta_j/\sigma^2}{2\tau_j}\Big\}, \quad j = 1, \ldots, p, \tag{16}$$
which can be simulated directly from (15) and (16).

3.3. Sampling Scheme

The Bayesian estimate of $\Theta = \{\rho, \alpha, k, \xi, \beta, \sigma^2, \tau\}$ is obtained from observations generated from the posterior of all unknown quantities by running the Gibbs sampler. Simulating $\beta_j$ from (13) is challenging and nonstandard because the parameter space is restricted by the constraint $Q_j \beta_j = 0$ for $j = 1, \ldots, p$; according to Panagiotelis and Smith [56], this is equivalent to drawing an unconstrained vector and transforming it to $\beta_j$ by (14). The MCMC sampling algorithm (Algorithm 1) is described below.
Algorithm 1 The MCMC sampling algorithm.
Input: Samples { ( x i , y i , z i ) } i = 1 , , n .
Initialization: Initialize $\Theta^{(0)} = \{\rho^{(0)}, \alpha^{(0)}, k^{(0)}, \xi^{(0)}, \beta^{(0)}, \sigma^{2(0)}, \tau^{(0)}\}$ by generating the unknown parameters from their respective priors.
MCMC iterations: Given the current state $\Theta^{(t-1)}$, successively draw $\Theta^{(t)}$ from $p(\Theta \mid x, y, z)$ for $t = 1, 2, 3, \ldots$ The detailed MCMC sampling cycles are outlined as follows.
(a) Generate $\rho^{(t)}$ from $p(\rho \mid x, y, z, \alpha^{(t-1)}, k^{(t-1)}, \xi^{(t-1)}, \beta^{(t-1)}, \sigma^{2(t-1)}, \tau^{(t-1)})$;
(b) Generate $\sigma^{2(t)}$ from $p(\sigma^2 \mid x, y, z, \rho^{(t)}, \alpha^{(t-1)}, k^{(t-1)}, \xi^{(t-1)}, \beta^{(t-1)}, \tau^{(t-1)})$;
(c) Generate $\alpha^{(t)}$ from $p(\alpha \mid x, y, z, \rho^{(t)}, k^{(t-1)}, \xi^{(t-1)}, \beta^{(t-1)}, \sigma^{2(t)}, \tau^{(t-1)})$;
(d) Generate $(k_j^{(t)}, \xi_j^{(t)})$ from $p(k_j, \xi_j \mid x, y, z, \rho^{(t)}, \alpha^{(t)}, k_{-j}, \xi_{-j}, \beta_{-j}, \tau^{(t-1)})$ for $j = 1, \ldots, p$;
(e) Generate $\beta_j^{(t)}$ from $p(\beta_j \mid x, y, z, \rho^{(t)}, \alpha^{(t)}, k^{(t)}, \xi^{(t)}, \beta_{-j}, \sigma^{2(t)}, \tau^{(t-1)})$, and adjust $\beta_j^{(t)}$ according to (14), for $j = 1, \ldots, p$;
(f) Generate $\tau_j^{(t)}$ from $p(\tau_j \mid k^{(t)}, \xi^{(t)}, \beta_j^{(t)}, \sigma^{2(t)})$ for $j = 1, \ldots, p$;
(g) Generate $\tau_0^{(t)}$ from $p(\tau_0 \mid \alpha^{(t)}, \sigma^{2(t)})$.
Output: The MCMC samples $\{\Theta^{(t)}\}_{t = 1, 2, 3, \ldots}$ from the conditional posteriors.

4. Empirical Illustrations

We demonstrate the performance of the PLASAR model and methodology by a simulation and use them to analyze real data. We use the Rook weight matrix [2] and the Case weight matrix [3] to examine the influence of the spatial weight matrix $W$. The Rook weight matrix is generated from Rook contiguity [59] by randomly allocating the $n$ spatial units on a lattice of $m \times m$ ($m^2 \ge n$) squares, finding the neighbors of each unit, and then row-normalizing. Meanwhile, we generate the Case weight matrix from the spatial scenario $W = I_r \otimes T_m$ [3] with $r$ districts and $m$ members in each district, each neighbor of a member within a district receiving equal weight [10], where $\otimes$ is the Kronecker product, $T_m = (1/(m-1))(1_m 1_m^T - I_m)$, and $1_m = (1, \ldots, 1)^T$ is an $m$-dimensional vector.
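The Case-type weight matrix $W = I_r \otimes T_m$ is straightforward to construct; a minimal sketch (the function name is ours):

```python
import numpy as np

def case_weight_matrix(r, m):
    """Case-type weights W = I_r (kron) T_m, T_m = (1/(m-1))(1_m 1_m^T - I_m):
    r districts of m members each; every member weights its m-1 district
    neighbours equally and has no links across districts."""
    Tm = (np.ones((m, m)) - np.eye(m)) / (m - 1)
    return np.kron(np.eye(r), Tm)
```

By construction each row sums to one, so the matrix is already row-normalized.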

4.1. Simulation

Consider the following PLASAR model:
$$y_i = \rho \sum_{l=1}^{n} w_{il} y_l + x_i^T \alpha + g_1(z_{i1}) + g_2(z_{i2}) + \varepsilon_i, \quad i = 1, \ldots, n,$$
where $x_i = (x_{i1}, x_{i2})^T$ follows a bivariate standard normal distribution, and $z_i = (z_{i1}, z_{i2})^T$ is a bivariate vector whose components $z_{i1}$ and $z_{i2}$ are mutually independent and follow uniform distributions on $(-1, 1)$ and $(0, 1)$, respectively. The nonparametric functions are $g_1(z_1) = \sin(\pi z_1)$ and $g_2(z_2) = 4z_2(1 - z_2^2) - 1$, $\varepsilon_i \sim N(0, \sigma^2)$, the parameters are set to $\alpha = (1, 1)^T$, and two cases of the variance are considered, $\sigma^2 \in \{0.25, 0.75\}$. We consider three cases of the spatial parameter, $\rho \in \{0.2, 0.5, 0.7\}$, representing weak to strong spatial dependence of the response. The sample sizes are $(r, m) \in \{(20, 5), (80, 5)\}$ for the Case weight matrix and $n \in \{100, 400\}$ for the Rook weight matrix, respectively.
In our computation, we run each simulation with 1000 replications, adopt quadratic B-splines, and set the hyperparameters $(r_0, s_0^2, r_{\tau_{\alpha 0}}, s_{\tau_{\alpha 0}}^2) = (1, 1, 1, 0.005)$ and $(r_{\tau_{\beta_j 0}}, s_{\tau_{\beta_j 0}}^2) = (1, 0.005)$ for $j = 1, \ldots, p$. The initial state of the Markov chain is selected as follows: all unknown parameters are sampled from their respective priors, and the tuning parameters $\sigma_\rho$ and $\sigma_{\xi_j}$, $j = 1, \ldots, p$, are gradually decreased or increased so that the acceptance rates are about 25%. For each replication, we generate 6000 sampled values and discard the first 2000 as a burn-in period. Based on the last 4000 sampled values, we compute, as averages over the 1000 replications, the posterior mean (Mean), the 95% posterior credible interval (95% CI), and the standard error (SE). In addition, the standard deviations (SD) of the estimated posterior means are calculated for comparison with the mean of the estimated posterior SEs.
We evaluate the performance of the nonparametric estimators by the integrated squared bias (Bias), the root integrated mean squared error (SSE), and the mean absolute deviation error (MADE):
$$\mathrm{Bias}(\hat g_j) = \int \big[E\hat g_j(z) - g_j(z)\big]^2 \, dz, \qquad \mathrm{SSE}(\hat g_j) = \left\{E \int \big[\hat g_j(z) - g_j(z)\big]^2 \, dz\right\}^{\frac{1}{2}},$$
$$\mathrm{MADE}_j = \frac{1}{200} \sum_{i=1}^{200} \big|\hat g_j(z_{ji}) - g_j(z_{ji})\big| \quad \text{and} \quad \mathrm{MADE} = \frac{1}{p} \sum_{j=1}^{p} \mathrm{MADE}_j$$
for $j = 1, \ldots, p$, where the mathematical expectations are estimated by their empirical versions, and the integrals are computed by a Riemann-sum approximation at 200 fixed, equally spaced grid points $\{z_{ji}\}_{i=1}^{200}$ in $[a_j, b_j]$. From model (1), the marginal effects are given by $\partial y / \partial x_j = (I_n - \rho W)^{-1} I_n \alpha_j$ for $j = 1, \ldots, q$. Following the suggestions of LeSage and Pace [52], the mean of either the row sums or the column sums of the off-diagonal elements is used as the indirect effect, the mean of the diagonal elements as the direct effect, and the sum of the indirect and direct effects as the total effect.
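The effects decomposition just described can be sketched as follows, in a simplified scalar-coefficient version (the function name is ours; `alpha_j` is the coefficient of the $j$th regressor):

```python
import numpy as np

def marginal_effects(rho, alpha_j, W):
    """LeSage-Pace summaries of the marginal-effect matrix
    S = (I_n - rho*W)^{-1} alpha_j: direct = mean diagonal, indirect = mean
    row sum of off-diagonal entries, total = direct + indirect."""
    n = W.shape[0]
    S = np.linalg.inv(np.eye(n) - rho * W) * alpha_j
    direct = np.trace(S) / n
    total = S.sum() / n          # mean row sum of the full matrix
    return direct, total - direct, total
```

With $\rho = 0$ the matrix $S$ is diagonal, so the direct effect reduces to $\alpha_j$ and the indirect effect vanishes; positive $\rho$ with a nonnegative $W$ generates positive spillovers.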
To check the convergence of our algorithm, we run five Markov chains with different starting values for each replication. The sampled traces of some parameters and of the nonparametric functions at grid points are displayed in Figure 1; the five parallel sequences mix quite well. We also compute the potential scale reduction factor $\hat R$ [60] for all unknown parameters and for the nonparametric functions at 20 selected grid points. Figure 2 shows all values of $\hat R$ against the iteration number. Following the suggestion of Gelman and Rubin [60], 2000 burn-in iterations are enough for the MCMC algorithm to converge, as all values of $\hat R$ are less than 1.2.
The boxplots of the Bias values are displayed in Figure 3. Under the Rook weight matrix, the medians are $\mathrm{Bias}_1 = 0.0147$ and $\mathrm{Bias}_2 = 0.0104$ for $n = 100$, and $\mathrm{Bias}_1 = 0.0039$ and $\mathrm{Bias}_2 = 0.0030$ for $n = 400$, respectively. Under the Case weight matrix, the medians are $\mathrm{Bias}_1 = 0.0148$ and $\mathrm{Bias}_2 = 0.0104$ for $(r, m) = (20, 5)$, and $\mathrm{Bias}_1 = 0.0038$ and $\mathrm{Bias}_2 = 0.0030$ for $(r, m) = (80, 5)$, respectively. Figure 3 also shows the boxplots of the SSE values. Under the Rook weight matrix, the medians are $\mathrm{SSE}_1 = 0.2490$ and $\mathrm{SSE}_2 = 0.2271$ for $n = 100$, and $\mathrm{SSE}_1 = 0.1253$ and $\mathrm{SSE}_2 = 0.1151$ for $n = 400$, respectively. Under the Case weight matrix, the medians are $\mathrm{SSE}_1 = 0.2495$ and $\mathrm{SSE}_2 = 0.2267$ for $(r, m) = (20, 5)$, and $\mathrm{SSE}_1 = 0.1257$ and $\mathrm{SSE}_2 = 0.1154$ for $(r, m) = (80, 5)$, respectively. The results show that both the Bias and SSE values of the nonparametric functions decrease as the sample size increases, indicating that the nonparametric estimation is convergent. Both the Case and Rook weight matrices yield reasonable estimation results.
The estimation results are reported in Table 1. We observe that the mean values of all estimators are very close to the corresponding true values, and the mean of the SEs is close to the respective SD, showing that the parameter estimates and SEs are accurate. Meanwhile, the larger the sample size under the same weight matrix, the more precise the estimates. These experiments have been repeated with different starting values, and the results are similar, implying that the MCMC sampling works well. Moreover, we find that the estimation of $\rho$ with the Case weight matrix is slightly better than that with the Rook weight matrix for the same sample size; the main reason is presumably that the Case weight matrix performs better than the Rook weight matrix under different variances $\sigma^2$. In addition, the general pattern in Table 1 is that all estimators exhibit a relatively larger bias in the total effect when, for the same sample size, the spatial dependence is strongly positive. Figure 4 depicts the fitted functions, together with their 95% CIs, from a typical sample with $\rho = 0.5$ and $\sigma^2 = 0.25$; the typical sample is selected so that its SSE values equal the medians over the 1000 replications. The fitted nonparametric functions clearly improve with increasing sample size.
For comparison purposes, we also approximate the nonparametric functions by the Bayesian P-splines approach [61], assigning a second-order random-walk prior to the spline coefficients. The boxplots of the MADE values with the Case weight matrix are shown in Figure 5. For our method, the medians are $\mathrm{MADE}_1 = 0.0997$, $\mathrm{MADE}_2 = 0.0804$, and $\mathrm{MADE} = 0.0916$ for $(r, m) = (20, 5)$, and $\mathrm{MADE}_1 = 0.0504$, $\mathrm{MADE}_2 = 0.0433$, and $\mathrm{MADE} = 0.0476$ for $(r, m) = (80, 5)$, respectively, which are slightly smaller than those of the Bayesian P-splines approach. The results show that the Bayesian free-knot splines approach is superior to the Bayesian P-splines approach in terms of fitting the unknown nonparametric functions and computing time. Furthermore, we compare the generalized method of moments estimator (GMME) of Du et al. [29] with the Bayesian MCMC estimator (BE) of our method, evaluating the nonparametric estimates by the integrated squared bias (Bias) and the root integrated mean squared error (SSE). Table 2 reports the results of the nonparametric estimation for the GMME and BE (only a replication with $(\rho, \sigma^2) = (0.5, 0.25)$ is displayed). The estimates improve with increasing sample size; the Bias of the BE is slightly smaller than that of the GMME, and the SSE of the BE is much smaller than that of the GMME for the same sample size, showing that the BE outperforms the GMME, although the latter also obtains reasonable estimates.

4.2. Application

We use the proposed model and estimation methods to analyze the well-known Sydney real estate data. A detailed description of the dataset can be found in Harezlak et al. [62]. The dataset contains a total of 37,676 properties sold in the Sydney Statistical Division in the calendar year 2001 and is available from the HRW package in R. We focus only on the last week of February to avoid temporal issues, which leaves 538 properties.
In this application, the house price (Price) is explained by four variables: average weekly income (Income), the level of particulate matter with a diameter of less than 10 micrometers recorded at the air pollution monitoring station closest to the house (PM$_{10}$), lot size (LS), and the distance from the house to the nearest coastline location in kilometers (DC). Income and PM$_{10}$ are assumed to have a linear effect on the response Price, whereas LS and DC have nonlinear effects. Meanwhile, a logarithmic transformation is applied to all variables to alleviate problems caused by large gaps in their domains, and all variables are transformed so that their marginal distributions are approximately standard normal. This motivates us to consider the PLASAR model:
$$y_i = \rho \sum_{l=1}^{n} w_{il} y_l + x_i^{T}\alpha + g_1(z_{i1}) + g_2(z_{i2}) + \varepsilon_i, \quad i = 1, \ldots, n,$$
where the response variable is y i = log ( Price i ) , and x i 1 = log ( Income i ) , x i 2 = log ( PM 10 i ) , z i 1 = log ( LS i ) , z i 2 = log ( DC i ) . For the weight matrix, we use the Euclidean distance between any two houses to construct the spatial weights [63]. The spatial weight w i l is
$$w_{il} = \exp\{-\|s_i - s_l\|\} \Big/ \sum_{k \neq i} \exp\{-\|s_i - s_k\|\},$$
where s i = ( L o n i , L a t i ) denotes the longitude and latitude of house i. We apply quadratic B-splines and assign the hyperparameters ( λ , r 0 , s 0 2 , r τ α 0 , s τ α 0 2 , r τ β j 0 , s τ β j 0 2 ) = ( 2 , 1 , 1 , 1 , 0.005 , 1 , 0.005 ) for j = 1 , , p in our computation. We gradually decrease or increase the tuning parameters σ ρ and σ ξ j so that the acceptance rates for updating ρ and ( k j , ξ j ) are around 25% for j = 1 , , p .
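The row-normalized exponential-distance weight matrix described above can be constructed as in the following sketch. The function name and the toy coordinates are ours (in practice the 538 geocoded houses would be used), and the distance is plain Euclidean distance on the (longitude, latitude) pairs, as in the specification above.

```python
import numpy as np

def spatial_weights(coords):
    """Row-normalized exponential-distance spatial weight matrix.

    coords: (n, 2) array of (longitude, latitude) pairs. Following the
    weight specification above, w_il = exp(-d_il) / sum_{k != i} exp(-d_ik),
    where d_il is the Euclidean distance between locations i and l and
    the diagonal is set to zero (no self-neighboring).
    """
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    w = np.exp(-d)
    np.fill_diagonal(w, 0.0)
    return w / w.sum(axis=1, keepdims=True)

# Three toy locations (longitude, latitude) near Sydney.
coords = np.array([[151.20, -33.90], [151.10, -33.80], [151.00, -33.70]])
W = spatial_weights(coords)
```

Row normalization makes each row of W sum to one, which is what makes the spatial lag a weighted average of neighboring responses.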
We generate 10,000 sampled values following a burn-in of 10,000 iterations and run the proposed Gibbs sampler five times with different initial states. Figure 6 plots the traces of some unknown parameters and of the nonparametric functions on grid points. The five parallel Markov chains clearly mix well. We further calculate the "potential scale reduction factor" R ^ for each of the unknown parameters and for the nonparametric functions on 20 selected grid points, which are plotted in Figure 7. The results indicate that the Markov chains have converged within the first 10,000 burn-in iterations.
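The potential scale reduction factor of Gelman and Rubin [60] can be computed from the parallel chains as sketched below. This is a minimal illustration of the classical (non-split) formula, with simulated draws standing in for actual MCMC output.

```python
import numpy as np

def gelman_rubin(chains):
    """Potential scale reduction factor R-hat for one scalar quantity.

    chains: (m, n) array holding m parallel chains of n post-burn-in
    draws each. Classical Gelman-Rubin (1992) formula; values near 1
    indicate that the parallel chains have converged to a common
    distribution.
    """
    m, n = chains.shape
    chain_means = chains.mean(axis=1)
    between = n * chain_means.var(ddof=1)        # between-chain variance B
    within = chains.var(axis=1, ddof=1).mean()   # within-chain variance W
    var_plus = (n - 1) / n * within + between / n
    return np.sqrt(var_plus / within)

# Five well-mixed synthetic chains: R-hat should be very close to 1.
rng = np.random.default_rng(1)
r_hat = gelman_rubin(rng.normal(size=(5, 10_000)))
```

Applying this to each parameter and to the nonparametric functions evaluated at the 20 selected grid points yields the values plotted in Figure 7.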
Table 3 lists the estimated parameters together with their SEs and 95% CIs. The estimate ρ ^ = 0.5548 with SE = 0.0307 implies a significant and positive spatial dependence in housing prices. Both covariates have significant effects on the housing price: the regression coefficient of Income is α ^ 1 = 0.3269 > 0 , indicating that Income has a positive effect on the housing price, while the regression coefficient of PM 10 is α ^ 2 = − 0.0810 < 0 , revealing that the housing price decreases as PM 10 increases.
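With a row-standardized weight matrix, the average total effect of the j-th linear covariate in a SAR-type model reduces to α̂_j / (1 − ρ̂), which is consistent with the total effects reported in Table 3 (e.g., 0.3269 / (1 − 0.5548) ≈ 0.7343 for Income). The sketch below (the function name is ours) keeps the general matrix form alongside the scalar shortcut:

```python
import numpy as np

def total_effect(alpha_j, rho, W=None):
    """Average total effect of the j-th linear covariate in a SAR model.

    For a row-standardized weight matrix the rows of (I - rho W)^{-1}
    sum to 1 / (1 - rho), so the average total effect is simply
    alpha_j / (1 - rho). The general matrix form is kept for weight
    matrices that are not row-standardized.
    """
    if W is None:
        return alpha_j / (1.0 - rho)
    n = W.shape[0]
    M = np.linalg.inv(np.eye(n) - rho * W)       # spatial multiplier matrix
    return alpha_j * M.sum() / n

# Estimates reported in Table 3 for Income: alpha_1 = 0.3269, rho = 0.5548.
effect_income = total_effect(0.3269, 0.5548)     # approximately 0.7343
```

The multiplier 1 / (1 − ρ̂) also explains why the total effects in Table 3 exceed the corresponding regression coefficients in absolute value.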
Figure 8 depicts the fitted functions together with their 95% CIs; both are clearly nonlinear. The curves show that g 1 ( z 1 ) has a local maximum of 0.6184 at around z 1 = 3.9198 and a local minimum of 0.1557 at around z 1 = 0.8237 , and g 2 ( z 2 ) has a local minimum of −0.8224 at around z 2 = 2.1605 . These results provide evidence that LS and DC have significant nonlinear effects on the housing price, S-shaped and U-shaped, respectively.
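Turning points such as those quoted above can be read off a posterior mean curve numerically. A minimal sketch (the function name and the test curve are ours) that locates interior local extrema via a sign change in the first differences:

```python
import numpy as np

def local_extrema(grid, g_hat):
    """Interior local maxima/minima of a curve evaluated on grid points.

    Detects turning points via a sign change in the first differences of
    the fitted values; returns lists of (location, value) pairs.
    """
    s = np.sign(np.diff(g_hat))
    idx = np.where(s[:-1] * s[1:] < 0)[0] + 1    # indices where the slope flips
    maxima = [(grid[i], g_hat[i]) for i in idx if s[i - 1] > 0]
    minima = [(grid[i], g_hat[i]) for i in idx if s[i - 1] < 0]
    return maxima, minima

# Sanity check on a known curve: sin has one interior maximum (at pi/2)
# and one interior minimum (at 3*pi/2) on [0, 2*pi].
grid = np.linspace(0.0, 2.0 * np.pi, 400)
maxima, minima = local_extrema(grid, np.sin(grid))
```

The accuracy of the reported locations is limited by the grid spacing, so a fine grid over the observed range of z is advisable.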

5. Summary

Spatial data are frequently encountered in practical applications and can be analyzed through the SAR model. To avoid some serious shortcomings of fully nonparametric models and to reduce the high risk of misspecification in traditional SAR models, we have considered PLASAR models for spatial data, which combine the PLA model and the SAR model. In addition to the spatial dependence between neighbors, the model captures both the linear and the nonlinear effects of the related covariates. We specified priors for all unknown parameters that lead to a proper posterior distribution, developed a fully Bayesian approach with free-knot splines to analyze the PLASAR model, and designed a Gibbs sampler to explore the full conditional posterior distributions. To obtain a rapidly-convergent algorithm, a modified Bayesian free-knot splines approach incorporated with powerful MCMC techniques was employed. A simulation study illustrated that the proposed model and estimation method perform satisfactorily in finite samples. The results show that the Bayesian estimator is efficient relative to the GMME, although the latter also yields reasonable estimates. Finally, the proposed model and methodology were applied to analyze real data.
This article focuses only on symmetrical data and homoscedastic independent errors. Since spatial data often fail to satisfy these conditions, a natural extension is to adapt the proposed model and methodology to deal with spatial errors and heteroscedasticity. While we use PLASAR models to assess the linear and nonlinear effects of the covariates on the spatial response, other models, such as partially linear single-index SAR models and partially linear varying-coefficient SAR models, could also be considered. Moreover, it would be interesting to develop a model selection method that determines whether each covariate enters linearly or nonlinearly. We leave these topics for future research.

Author Contributions

Supervision, Z.C. and J.C.; software, Z.C.; methodology, Z.C.; writing—original draft preparation, Z.C.; writing—review and editing, Z.C. and J.C. Both authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Natural Science Foundation of China (12001105), the Natural Science Foundation of Fujian Province (2020J01170), the Postdoctoral Science Foundation of China (2019M660156), the Program for Probability and Statistics: Theory and Application (No. IRTL1704), and the Program for Innovative Research Team in Science and Technology in Fujian Province University (IRTSTFJ).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are openly available in Reference [62].

Acknowledgments

The authors are most grateful to the anonymous referees and the editors, whose careful reading and insightful comments have helped to significantly improve this paper.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Cliff, A.D.; Ord, J.K. Spatial Autocorrelation; Pion Ltd.: London, UK, 1973. [Google Scholar]
  2. Anselin, L. Spatial Econometrics: Methods and Models; Kluwer Academic Publisher: Dordrecht, The Netherlands, 1988. [Google Scholar]
  3. Case, A.C. Spatial patterns in household demand. Econometrica 1991, 59, 953–965. [Google Scholar] [CrossRef] [Green Version]
  4. Cressie, N. Statistics for Spatial Data; John Wiley and Sons: New York, NY, USA, 1993. [Google Scholar]
  5. LeSage, J.P. Bayesian estimation of spatial autoregressive models. Int. Reg. Sci. Rev. 1997, 20, 113–129. [Google Scholar] [CrossRef]
  6. LeSage, J.P. Bayesian estimation of limited dependent variable spatial autoregressive models. Geogr. Anal. 2000, 32, 19–35. [Google Scholar] [CrossRef]
  7. Anselin, L.; Bera, A.K. Spatial dependence in linear regression models with an introduction to spatial econometrics. In Handbook of Applied Economics Statistics; Marcel Dekker: New York, NY, USA, 1998. [Google Scholar]
  8. Ord, J. Estimation methods for models of spatial interaction. J. Am. Stat. Assoc. 1975, 70, 120–126. [Google Scholar] [CrossRef]
  9. Kelejian, H.H.; Prucha, I.R. A generalized moments estimator for the autoregressive parameter in a spatial model. Int. Econ. Rev. 1999, 40, 509–533. [Google Scholar] [CrossRef] [Green Version]
  10. Lee, L.F. Asymptotic distribution of quasi-maximum likelihood estimators for spatial autoregressive models. Econometrica 2004, 72, 1899–1925. [Google Scholar] [CrossRef]
  11. Paelinck, J.H.P.; Klaassen, L.H.; Ancot, J.P.; Verster, A.C.P. Spatial Econometrics; Gower: Farnborough, UK, 1979. [Google Scholar]
  12. Basile, R.; Gress, B. Semi-parametric spatial auto-covariance models of regional growth behaviour in Europe. Rég. Dév. 2004, 21, 93–118. [Google Scholar] [CrossRef]
  13. Basile, R. Regional economic growth in Europe: A semiparametric spatial dependence approach. Pap. Reg. Sci. 2008, 87, 527–544. [Google Scholar] [CrossRef]
  14. Su, L.J.; Jin, S.N. Profile quasi-maximum likelihood estimation of partially linear spatial autoregressive models. J. Econom. 2010, 157, 18–33. [Google Scholar] [CrossRef]
  15. Baltagi, B.H.; Li, D. LM tests for functional form and spatial correlation. Int. Reg. Sci. Rev. 2001, 24, 194–225. [Google Scholar] [CrossRef]
  16. Pace, P.K.; Barry, R.; Slawson, V.C., Jr.; Sirmans, C.F. Simultaneous spatial and functional form transformation. In Advances in Spatial Econometrics; Anselin, L., Florax, R., Rey, S.J., Eds.; Springer: Berlin, Germany, 2004; pp. 197–224. [Google Scholar]
  17. van Gastel, R.A.J.J.; Paelinck, J.H.P. Computation of Box-cox transform parameters: A new method and its application to spatial econometrics. In New Directions in Spatial Econometrics; Anselin, L., Florax, R.J.G.M., Eds.; Springer: Berlin/Heidelberg, Germany, 1995; pp. 136–155. [Google Scholar]
  18. Yang, Z.; Li, C.; Tse, Y.K. Functional form and spatial dependence in dynamic panels. Econ. Lett. 2006, 91, 138–145. [Google Scholar] [CrossRef]
  19. Bellman, R.E. Adaptive Control Processes; Princeton University Press: Princeton, NJ, USA, 1961. [Google Scholar]
  20. Friedman, J.H.; Stuetzle, W. Projection pursuit regression. J. Am. Stat. Assoc. 1981, 76, 817–823. [Google Scholar] [CrossRef]
  21. Engle, R.F.; Granger, C.W.; Rice, J.; Weiss, A. Semiparametric Estimates of the Relation Between Weather and Electricity Sales. J. Am. Stat. Assoc. 1986, 81, 310–320. [Google Scholar] [CrossRef]
  22. Hastie, T.J.; Tibshirani, R.J. Generalized Additive Models; Chapman and Hall: London, UK, 1990. [Google Scholar]
  23. Hastie, T.J.; Tibshirani, R.J. Varying-coefficient models. J. R. Stat. B 1993, 55, 757–796. [Google Scholar] [CrossRef]
  24. Su, L.J. Semiparametric GMM estimation of spatial autoregressive models. J. Econom. 2012, 167, 543–560. [Google Scholar] [CrossRef]
  25. Chen, J.Q.; Wang, R.F.; Huang, Y.X. Semiparametric spatial autoregressive model: A two-step Bayesian approach. Ann. Public Health Res. 2015, 2, 1012. [Google Scholar]
  26. Wei, H.J.; Sun, Y. Heteroskedasticity-robust semi-parametric GMM estimation of a spatial model with space-varying coefficients. Spat. Econ. Anal. 2016, 12, 113–128. [Google Scholar] [CrossRef]
  27. Krisztin, T. The determinants of regional freight transport: A spatial, semiparametric approach. Geogr. Anal. 2017, 49, 268–308. [Google Scholar] [CrossRef]
  28. Krisztin, T. Semi-parametric spatial autoregressive models in freight generation modeling. Transp. Res. Part E Logist. Transp. Rev. 2018, 114, 121–143. [Google Scholar] [CrossRef]
  29. Du, J.; Sun, X.Q.; Cao, R.Y.; Zhang, Z.Z. Statistical inference for partially linear additive spatial autoregressive models. Spat. Stat. 2018, 25, 52–67. [Google Scholar] [CrossRef]
  30. Chen, J.B.; Cheng, S.L. GMM estimation of a partially linear additive spatial error model. Mathematics 2021, 9, 622. [Google Scholar] [CrossRef]
  31. Liang, H.; Thurston, S.W.; Ruppert, D.; Apanasovich, T.; Hauser, R. Additive partial linear models with measurement errors. Biometrika 2008, 95, 667–678. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  32. Deng, G.H.; Liang, H. Model averaging for semiparametric additive partial linear models. Sci. China Math. 2010, 53, 1363–1376. [Google Scholar] [CrossRef]
  33. Wang, L.; Yang, L.J. Spline-backfitted kernel smoothing of nonlinear additive autoregression model. Ann. Stat. 2007, 35, 2474–2503. [Google Scholar] [CrossRef] [Green Version]
  34. Zhang, J.; Lian, H. Partially linear additive models with Unknown Link Functions. Scand. J. Stat. 2018, 45, 255–282. [Google Scholar] [CrossRef]
  35. Hu, Y.A.; Zhao, K.F.; Lian, H. Bayesian Quantile Regression for Partially Linear Additive Models. Stat. Comput. 2015, 25, 651–668. [Google Scholar] [CrossRef] [Green Version]
  36. Lian, H. Semiparametric estimation of additive quantile regression models by twofold penalty. J. Bus. Econ. Stat. 2012, 30, 337–350. [Google Scholar] [CrossRef]
  37. Sherwood, B.; Wang, L. Partially linear additive quantile regression in ultra-high dimension. Ann. Stat. 2016, 44, 288–317. [Google Scholar] [CrossRef]
  38. Yu, K.M.; Lu, Z.D. Local linear additive quantile regression. Scand. J. Stat. 2004, 31, 333–346. [Google Scholar] [CrossRef]
  39. Du, P.; Cheng, G.; Liang, H. Semiparametric regression models with additive nonparametric components and high dimensional parametric components. Comput. Stat. Data Anal. 2012, 56, 2006–2017. [Google Scholar] [CrossRef]
  40. Guo, J.; Tang, M.L.; Tian, M.Z.; Zhu, K. Variable selection in high-dimensional partially linear additive models for composite quantile regression. Comput. Stat. Data Anal. 2013, 65, 56–67. [Google Scholar] [CrossRef]
  41. Liu, X.; Wang, L.; Liang, H. Estimation and variable selection for semiparametric additive partial linear models. Stat. Sin. 2011, 21, 1225–1248. [Google Scholar] [CrossRef] [Green Version]
  42. Wang, L.; Liu, X.; Liang, H.; Carroll, R.J. Estimation and variable selection for generalized additive partial linear models. Ann. Stat. 2011, 39, 1827–1851. [Google Scholar] [CrossRef] [PubMed]
  43. Denison, D.G.T.; Mallick, B.K.; Smith, A.F.M. Automatic Bayesian curve fitting. J. R. Stat. B 1998, 60, 333–350. [Google Scholar] [CrossRef]
  44. Dimatteo, I.; Genovese, C.R.; Kass, R.E. Bayesian curve fitting with free-knot splines. Biometrika 2001, 88, 1055–1071. [Google Scholar] [CrossRef]
  45. Holmes, C.C.; Mallick, B.K. Bayesian regression with multivariate linear splines. J. R. Stat. B 2001, 63, 3–17. [Google Scholar] [CrossRef]
  46. Holmes, C.C.; Mallick, B.K. Generalized nonlinear modeling with multivariate free-knot regression splines. J. Am. Stat. Assoc. 2003, 98, 352–368. [Google Scholar] [CrossRef]
  47. Lindstrom, M.J. Bayesian estimation of free-knot splines using reversible jump. Comput. Stat. Data Anal. 2002, 41, 255–269. [Google Scholar] [CrossRef]
  48. Poon, W.-Y.; Wang, H.-B. Bayesian analysis of generalized partially linear single-index models. Comput. Stat. Data Anal. 2013, 68, 251–261. [Google Scholar] [CrossRef]
  49. Poon, W.Y.; Wang, H.B. Multivariate partially linear single-index models: Bayesian analysis. J. Nonparametr. Stat. 2014, 26, 755–768. [Google Scholar] [CrossRef]
  50. Chen, Z.Y.; Wang, H.B. Latent single-index models for ordinal data. Stat. Comput. 2018, 28, 699–711. [Google Scholar] [CrossRef]
  51. Wang, H.B. A Bayesian multivariate partially linear single-index probit model for ordinal responses. J. Stat. Comput. Sim. 2018, 88, 1616–1636. [Google Scholar] [CrossRef]
  52. LeSage, P.J.; Pace, R.K. Introduction to Spatial Econometrics; CRC Press: Boca Raton, FL, USA; London, UK; New York, NY, USA, 2009. [Google Scholar]
  53. Metropolis, N.; Rosenbluth, A.W.; Rosenbluth, M.N.; Teller, A.H.; Teller, E. Equation of state calculations by fast computing machines. J. Chem. Phys. 1953, 21, 1087–1091. [Google Scholar] [CrossRef] [Green Version]
  54. Hastings, W.K. Monte Carlo sampling methods using Markov chains and their applications. Biometrika 1970, 57, 97–109. [Google Scholar] [CrossRef]
  55. Tanner, M.A. Tools for Statistical Inference: Methods for the Exploration of Posterior Distributions and Likelihood Functions, 2nd ed.; Springer: New York, NY, USA, 1993. [Google Scholar]
  56. Panagiotelis, A.; Smith, M. Bayesian identification, selection and estimation of semiparametric functions in high-dimensional additive models. J. Econom. 2008, 143, 291–316. [Google Scholar] [CrossRef]
  57. Green, P. Reversible jump Markov chain Monte Carlo computation and Bayesian model determination. Biometrika 1995, 82, 711–732. [Google Scholar] [CrossRef]
  58. Chen, M.-H.; Schmeiser, B.W. General hit-and-run Monte Carlo sampling for evaluating multidimensional integrals. Oper. Res. Lett. 1996, 19, 161–169. [Google Scholar] [CrossRef]
  59. Su, L.J.; Yang, Z.L. Instrumental Variable Quantile Estimation of Spatial Autoregressive Models; Working Paper; Singapore Management University: Singapore, 2009. [Google Scholar]
  60. Gelman, A.; Rubin, D.B. Inference from iterative simulation using multiple sequences. Stat. Sci. 1992, 7, 457–511. [Google Scholar] [CrossRef]
  61. Chen, Z.Y.; Chen, M.H.; Xing, G.D. Bayesian Estimation of Partially Linear Additive Spatial Autoregressive Models with P-Splines. Math. Probl. Eng. 2021, 2021, 1777469. [Google Scholar] [CrossRef]
  62. Harezlak, J.; Ruppert, D.; Wand, M. Semiparametric Regression with R; Springer: New York, NY, USA, 2018. [Google Scholar]
  63. Sun, Y.; Yan, H.J.; Zhang, W.Y.; Lu, Z. A Semiparametric spatial dynamic model. Ann. Stat. 2014, 42, 700–727. [Google Scholar] [CrossRef] [Green Version]
Figure 1. Trace plots of five parallel Markov chains with different starting values for some parameters and nonparametric functions (only a replication with ( r , m ) = ( 80 , 5 ) and ( ρ , σ 2 ) = ( 0.5 , 0.25 ) is displayed).
Figure 2. The “potential scale reduction factor” R ^ for simulation results (the case of ( ρ , σ 2 ) = ( 0.5 , 0.25 ) ).
Figure 3. The boxplots (a,b) display the integrated squared bias, the boxplots (c,d) display the root integrated mean squared errors. (The two panels on the left are based on the Rook weight matrix, and the two panels on the right are based on the Case weight matrix with ( ρ , σ 2 ) = ( 0.5 , 0.25 ) ).
Figure 4. The true functions (solid lines), the fitted functions (dotted lines) and their 95% CI (dot-dashed lines) for a typical sample (the left panel based on the Rook weight matrix and the right panel based on the Case weight matrix with ( ρ , σ 2 ) = ( 0.5 , 0.25 ) ).
Figure 5. The boxplots (a) display the mean absolute deviation errors with the Case weight matrix for ( r , m ) = ( 20 , 5 ) and (b) for ( r , m ) = ( 80 , 5 ) (the three panels on the left are based on Bayesian free-knot splines and the three panels on the right are based on Bayesian P-splines).
Figure 6. Trace plots of five parallel Markov chains with different starting values for some parameters and nonparametric functions.
Figure 7. The “potential scale reduction factor” R ^ for Sydney real estate data.
Figure 8. The fitted functions (dotted lines) and their 95% CI (dot-dashed lines) in the model (17) for Sydney real estate data.
Table 1. Simulation results of parametric estimation.
Parameter | n | Rook Weight Matrix | (r, m) | Case Weight Matrix
(each weight-matrix block: Mean | SE | SD | 95% CI)
ρ = 0.2000 100 0.2018 0.0552 0.0492 ( 0.0929 , 0.3099 ) (20, 5) 0.1987 0.0512 0.0473 ( 0.0973 , 0.2990 )
α 1 = 1.0000 0.9930 0.0649 0.0623 ( 0.8655 , 1.1203 ) 0.9930 0.0648 0.0620 ( 0.8656 , 1.1202 )
α 2 = −1.0000 −0.9956 0.0650 0.0633 ( −1.1231 , −0.8680 ) −0.9955 0.0650 0.0632 ( −1.1202 , −0.8679 )
σ 2 = 0.2500 0.2716 0.0447 0.0381 ( 0.1980 , 0.3723 ) 0.2714 0.0447 0.0382 ( 0.1979 , 0.3720 )
Total effect
x 1 = 1.2500 1.2544 0.1188 0.1064 ( 1.0378 , 1.5042 ) 1.2487 0.1134 0.1083 ( 1.0340 , 1.4853 )
x 2 = −1.2500 −1.2581 0.1189 0.1115 ( −1.5083 , −1.0414 ) −1.2516 0.1136 0.1057 ( −1.4886 , −1.0424 )
ρ = 0.5000 0.5000 0.0462 0.0414 ( 0.4081 , 0.5896 ) 0.4986 0.0344 0.0318 ( 0.4304 , 0.5655 )
α 1 = 1.0000 0.9930 0.0652 0.0627 ( 0.8650 , 1.1210 ) 0.9931 0.0652 0.0620 ( 0.8651 , 1.1211 )
α 2 = −1.0000 −0.9957 0.0653 0.0633 ( −1.1239 , −0.8674 ) −0.9957 0.0653 0.0635 ( −1.1239 , −0.8676 )
σ 2 = 0.2500 0.2719 0.0450 0.0383 ( 0.1980 , 0.3731 ) 0.2717 0.0453 0.0384 ( 0.1978 , 0.3728 )
Total effect
x 1 = 2.0000 2.0153 0.2283 0.1971 ( 1.6229 , 2.4969 ) 1.9972 0.1818 0.1734 ( 1.6633 , 2.3758 )
x 2 = −2.0000 −2.0215 0.2311 0.2053 ( −2.5037 , −1.6281 ) −2.0018 0.1821 0.1687 ( −2.3817 , −1.6673 )
ρ = 0.7000 0.6994 0.0348 0.0319 ( 0.6300 , 0.7667 ) 0.6988 0.0217 0.0200 ( 0.6557 , 0.7409 )
α 1 = 1.0000 0.9932 0.0655 0.0629 ( 0.8648 , 1.1219 ) 0.9934 0.0656 0.0621 ( 0.8648 , 1.1222 )
α 2 = −1.0000 −0.9959 0.0655 0.0635 ( −1.1245 , −0.8673 ) −0.9934 0.0656 0.0639 ( −1.1249 , −0.8671 )
σ 2 = 0.2500 0.2721 0.0451 0.0385 ( 0.1981 , 0.3738 ) 0.2718 0.0451 0.0386 ( 0.1977 , 0.3736 )
Total effect
x 1 = 3.3333 3.3823 0.4428 0.3911 ( 2.6360 , 4.3697 ) 3.3273 0.3034 0.2895 ( 2.7702 , 3.9615 )
x 2 = −3.3333 −3.3924 0.4434 0.4034 ( −4.3811 , −2.6451 ) −3.3346 0.3037 0.2810 ( −3.9697 , −2.7772 )
ρ = 0.2000 100 0.1982 0.0766 0.0750 ( 0.0469 , 0.3471 ) (20, 5) 0.1914 0.0713 0.0738 ( 0.0506 , 0.3293 )
α 1 = 1.0000 0.9839 0.1108 0.1075 ( 0.7663 , 1.2011 ) 0.9840 0.1107 0.1071 ( 0.7665 , 1.2013 )
α 2 = −1.0000 −0.9882 0.1108 0.1095 ( −1.2059 , −0.7708 ) −0.9878 0.1108 0.1093 ( −1.2050 , −0.7701 )
σ 2 = 0.7500 0.8234 0.1385 0.1386 ( 0.5957 , 1.1359 ) 0.8230 0.1393 0.1391 ( 0.5955 , 1.1353 )
Total effect
x 1 = 1.2500 1.2487 0.1858 0.1743 ( 0.9164 , 1.6442 ) 1.2369 0.1767 0.1771 ( 0.9169 , 1.6083 )
x 2 = −1.2500 −1.2552 0.1865 0.1841 ( −1.6525 , −0.9221 ) −1.2408 0.1771 0.1734 ( −1.6130 , −0.9189 )
ρ = 0.5000 0.4947 0.0637 0.0630 ( 0.3676 , 0.6170 ) 0.4934 0.0480 0.0498 ( 0.3975 , 0.5853 )
α 1 = 1.0000 0.9847 0.1112 0.1082 ( 0.7667 , 1.2031 ) 0.9849 0.1112 0.1069 ( 0.7667 , 1.2033 )
α 2 = −1.0000 −0.9890 0.1112 0.1095 ( −1.2073 , −0.7707 ) −0.9887 0.1113 0.1097 ( −1.2070 , −0.7703 )
σ 2 = 0.7500 0.8250 0.1398 0.1402 ( 0.5955 , 1.1404 ) 0.8248 0.1404 0.1407 ( 0.5951 , 1.1405 )
Total effect
x 1 = 2.0000 2.0079 0.3449 0.3154 ( 1.4275 , 2.7545 ) 1.9782 0.2826 0.2831 ( 1.4639 , 2.5733 )
x 2 = −2.0000 −2.0187 0.3449 0.3321 ( −2.7676 , −1.4365 ) −1.9844 0.2830 0.2778 ( −2.5805 , −1.4696 )
ρ = 0.7000 0.6942 0.0479 0.0477 ( 0.5984 , 0.7853 ) 0.6955 0.0301 0.0314 ( 0.6351 , 0.7528 )
α 1 = 1.0000 0.9854 0.1115 0.1086 ( 0.7670 , 1.2044 ) 0.9857 0.1116 0.1074 ( 0.7673 , 1.2052 )
α 2 = −1.0000 −0.9896 0.1116 0.1099 ( −1.2090 , −0.7708 ) −0.9857 0.1117 0.1104 ( −1.2087 , −0.7705 )
σ 2 = 0.7500 0.8268 0.1417 0.1417 ( 0.5959 , 1.1454 ) 0.8263 0.1420 0.1418 ( 0.5949 , 1.1455 )
Total effect
x 1 = 3.3333 3.3768 0.6705 0.6079 ( 2.3087 , 4.8873 ) 3.2961 0.4704 0.4722 ( 2.4414 , 4.2873 )
x 2 = −3.3333 −3.3950 0.6743 0.6351 ( −4.9107 , −2.3229 ) −3.3061 0.4716 0.4637 ( −4.2986 , −2.4486 )
ρ = 0.2000 400 0.1996 0.0245 0.0235 ( 0.1514 , 0.2475 ) ( 80 , 5 ) 0.2006 0.0218 0.0208 ( 0.1577 , 0.2431 )
α 1 = 1.0000 1.0005 0.0297 0.0286 ( 0.9422 , 1.0587 ) 1.0005 0.0297 0.0287 ( 0.9423 , 1.0587 )
α 2 = −1.0000 −0.9972 0.0297 0.0290 ( −1.0554 , −0.9389 ) −0.9972 0.0297 0.0291 ( −1.0554 , −0.9390 )
σ 2 = 0.2500 0.2549 0.0186 0.0185 ( 0.2210 , 0.2939 ) 0.2549 0.0186 0.0185 ( 0.2210 , 0.2939 )
Total effect
x 1 = 1.2500 1.2521 0.0524 0.0503 ( 1.1525 , 1.3578 ) 1.2532 0.0496 0.0477 ( 1.1584 , 1.3527 )
x 2 = −1.2500 −1.2480 0.0524 0.0514 ( −1.3539 , −1.1484 ) −1.2490 0.0494 0.0480 ( −1.3484 , −1.1546 )
ρ = 0.5000 0.4995 0.0207 0.0198 ( 0.4587 , 0.5399 ) 0.5003 0.0148 0.0140 ( 0.4712 , 0.5290 )
α 1 = 1.0000 1.0006 0.0299 0.0287 ( 0.9420 , 1.0590 ) 1.0005 0.0299 0.0288 ( 0.9419 , 1.0590 )
α 2 = −1.0000 −0.9972 0.0299 0.0287 ( −1.0558 , −0.9387 ) −0.9972 0.0299 0.0292 ( −1.0557 , −0.9387 )
σ 2 = 0.2500 0.2550 0.0187 0.0186 ( 0.2210 , 0.2941 ) 0.2549 0.0187 0.0186 ( 0.2209 , 0.2941 )
Total effect
x 1 = 2.0000 2.0051 0.0974 0.0932 ( 1.8227 , 2.2045 ) 2.0049 0.0797 0.0765 ( 1.8526 , 2.1651 )
x 2 = −2.0000 −1.9984 0.0973 0.0949 ( −2.1977 , −1.8162 ) −1.9983 0.0795 0.0772 ( −2.1580 , −1.8465 )
ρ = 0.7000 0.6996 0.0158 0.0151 ( 0.6684 , 0.7304 ) 0.7001 0.0093 0.0087 ( 0.6817 , 0.7183 )
α 1 = 1.0000 1.0006 0.0300 0.0288 ( 0.9419 , 1.0594 ) 1.0004 0.0300 0.0289 ( 0.9416 , 1.0593 )
α 2 = −1.0000 −0.9972 0.0300 0.0291 ( −1.0560 , −0.9385 ) −0.9971 0.0300 0.0293 ( −1.0560 , −0.9384 )
σ 2 = 0.2500 0.2550 0.0187 0.0183 ( 0.2210 , 0.2942 ) 0.2549 0.0188 0.0187 ( 0.2208 , 0.2942 )
Total effect
x 1 = 3.3333 3.3475 0.1920 0.1833 ( 2.9942 , 3.7466 ) 3.3415 0.1335 0.1272 ( 3.0867 , 3.6098 )
x 2 = −3.3333 −3.3364 0.1917 0.1862 ( −3.7347 , −2.9837 ) −3.3304 0.1332 0.1283 ( −3.5980 , −3.0764 )
ρ = 0.2000 400 0.1986 0.0366 0.0365 ( 0.1266 , 0.2701 ) ( 80 , 5 ) 0.1994 0.0326 0.0323 ( 0.1354 , 0.2630 )
α 1 = 1.0000 0.9999 0.0512 0.0493 ( 0.8996 , 1.1001 ) 0.9999 0.0512 0.0495 ( 0.8997 , 1.1001 )
α 2 = −1.0000 −0.9942 0.0512 0.0501 ( −1.0945 , −0.8940 ) −0.9943 0.0512 0.0501 ( −1.0945 , −0.8942 )
σ 2 = 0.7500 0.7597 0.0555 0.0558 ( 0.6587 , 0.8760 ) 0.7596 0.0555 0.0560 ( 0.6586 , 0.8757 )
Total effect
x 1 = 1.2500 1.2527 0.0847 0.0827 ( 1.0936 , 1.4257 ) 1.2530 0.0807 0.0786 ( 1.1000 , 1.4165 )
x 2 = −1.2500 −1.2457 0.0847 0.0843 ( −1.4187 , −1.0866 ) −1.2459 0.0805 0.0800 ( −1.4090 , −1.0936 )
ρ = 0.5000 0.4982 0.0306 0.0306 ( 0.4376 , 0.5578 ) 0.4995 0.0219 0.0216 ( 0.4563 , 0.5420 )
α 1 = 1.0000 1.0001 0.0514 0.0494 ( 0.8994 , 1.1007 ) 1.0000 0.0514 0.0497 ( 0.8994 , 1.1007 )
α 2 = −1.0000 −0.9944 0.0513 0.0501 ( −1.0950 , −0.8939 ) −0.9943 0.0514 0.0502 ( −1.0951 , −0.8938 )
σ 2 = 0.7500 0.7600 0.0559 0.0564 ( 0.6585 , 0.8770 ) 0.7598 0.0559 0.0567 ( 0.6582 , 0.8769 )
Total effect
x 1 = 2.0000 2.0068 0.1541 0.1509 ( 1.7231 , 2.3272 ) 2.0048 0.1293 0.1256 ( 1.7601 , 2.2668 )
x 2 = −2.0000 −1.9956 0.1538 0.1538 ( −2.3151 , −1.7121 ) −1.9935 0.1289 0.1280 ( −2.2544 , −1.7496 )
ρ = 0.7000 0.6984 0.0232 0.0234 ( 0.6524 , 0.7435 ) 0.6996 0.0137 0.0136 ( 0.6725 , 0.7262 )
α 1 = 1.0000 1.0002 0.0515 0.0496 ( 0.8993 , 1.1012 ) 1.0001 0.0516 0.0499 ( 0.8991 , 1.1013 )
α 2 = −1.0000 −0.9945 0.0515 0.0502 ( −1.0956 , −0.8937 ) −0.9944 0.0516 0.0504 ( −1.0955 , −0.8936 )
σ 2 = 0.7500 0.7603 0.0563 0.0569 ( 0.6581 , 0.8782 ) 0.7600 0.0564 0.0573 ( 0.6575 , 0.8781 )
Total effect
x 1 = 3.3333 3.3537 0.2978 0.2935 ( 2.8195 , 3.9858 ) 3.3415 0.2155 0.2091 ( 2.9334 , 3.7779 )
x 2 = −3.3333 −3.3351 0.2970 0.2967 ( −3.9661 , −2.8023 ) −3.3226 0.2150 0.2134 ( −3.7580 , −2.9153 )
Table 2. Simulation results of the nonparametric estimation.
Functions | (r, m) | GMME: Bias | GMME: SSE | BE: Bias | BE: SSE
g 1 ( · ) | (60, 5) | 0.014 | 7.947 | 0.006 | 0.146
g 2 ( · ) | (60, 5) | 0.009 | 7.246 | 0.007 | 0.141
g 1 ( · ) | (80, 5) | 0.011 | 6.860 | 0.004 | 0.131
g 2 ( · ) | (80, 5) | 0.004 | 6.235 | 0.001 | 0.117
Table 3. Parametric estimation in the model (17) for Sydney real estate data.
Parameter | Mean | SE | 95% CI
ρ 0.5548 0.0307 ( 0.4932 , 0.6160 )
α 1 0.3269 0.0326 ( 0.2630 , 0.3908 )
α 2 −0.0810 0.0318 ( −0.1433 , −0.0186 )
σ 2 0.3269 0.0326 ( 0.2630 , 0.3908 )
Total effect
x 1 0.7343 0.0875 ( 0.5573 , 0.9073 )
x 2 −0.1819 0.0866 ( −0.3531 , −0.0087 )
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Chen, Z.; Chen, J. Bayesian Analysis of Partially Linear Additive Spatial Autoregressive Models with Free-Knot Splines. Symmetry 2021, 13, 1635. https://doi.org/10.3390/sym13091635

