Spatial Durbin Model with Expansion Using Casetti’s Approach: A Case Study for Rainfall Prediction in Java Island, Indonesia

Andriyana, Yudhie; Falah, Annisa Nur; Ruchjana, Budi Nurani; Sulaiman, Albertus; Hermawan, Eddy; Harjana, Teguh; Lim-Polestico, Daisy Lou

doi:10.3390/math12152304


by Yudhie Andriyana ¹,*, Annisa Nur Falah ², Budi Nurani Ruchjana ³, Albertus Sulaiman ⁴, Eddy Hermawan ⁴, Teguh Harjana ⁴ and Daisy Lou Lim-Polestico ⁵

¹ Department of Statistics, Faculty of Mathematics and Natural Sciences, Universitas Padjadjaran, Sumedang 45363, Indonesia
² Post Doctoral Program, Department of Statistics, Faculty of Mathematics and Natural Sciences, Universitas Padjadjaran, Sumedang 45363, Indonesia
³ Department of Mathematics, Faculty of Mathematics and Natural Sciences, Universitas Padjadjaran, Sumedang 45363, Indonesia
⁴ Research Center for Climate and Atmosphere, National Research and Innovation Agency (BRIN), Jakarta Pusat 10340, Indonesia
⁵ Center for Computational Analytics and Modeling, Premier Research Institute of Science and Mathematics, MSU-Iligan Institute of Technology, Iligan City 9200, Philippines
* Author to whom correspondence should be addressed.

Mathematics 2024, 12(15), 2304; https://doi.org/10.3390/math12152304

Submission received: 13 June 2024 / Revised: 4 July 2024 / Accepted: 5 July 2024 / Published: 23 July 2024


Abstract

Research on rainfall is critically important due to its significant impact on climate change and natural disasters in Indonesia. Various factors influence rainfall variability. Consequently, when examining spatial aspects, it is likely that spatial dependency exists not only in the response variable but also in the exogenous variables. Hence, a model that accounts for spatial dependencies between these variables is required. The integration of the Spatial Durbin Model (SDM) with Casetti’s expansion approach can be utilized to predict spatial patterns of rainfall influenced by exogenous variables. By incorporating spatial effects and relevant independent variables, this model can provide more precise estimates of rainfall distribution across different regions. This modeling technique is particularly effective for accurate rainfall prediction, considering exogenous factors such as air temperature, humidity, solar irradiation, and surface pressure. The SDM with Casetti’s expansion approach was employed to predict rainfall patterns in the Java Island region, utilizing data from the National Aeronautics and Space Administration’s Prediction of Worldwide Energy Resources (NASA POWER) big data website. The application of this model in the context of rainfall prediction highlights its importance in enhancing the understanding of weather dynamics and aiding disaster risk mitigation in Java Island, a highly populated region characterized by a Monsoon rainfall pattern. The rainfall prediction follows a Knowledge Discovery in Databases (KDD) methodology. The results of this study are expected to be valuable to relevant agencies, such as the Meteorology, Climatology, and Geophysics Agency (BMKG), and agribusiness companies, improving agricultural planning and planting seasons. Additionally, the general public can benefit from more accurate climate information, particularly regarding rainfall. 
The computational framework is developed within an RShiny web application, and the performance of the proposed technique is measured by the Mean Absolute Percentage Error (MAPE), achieving a very accurate prediction rate of 2.78%.

Keywords: Spatial Durbin Model; expansion; rainfall; knowledge discovery in databases

MSC: 62-08; 62H11; 62P12

1. Introduction

Indonesia, a tropical archipelago, experiences high rainfall throughout the year. This is primarily due to its location around the equator and its geographical position between the Asian and Australian continents, as well as the Pacific and Indian Oceans [1]. Research on rainfall is critically important because this phenomenon significantly affects various aspects of human life and the environment, including agriculture, public health, water resource management, climate change, and natural disasters. Extreme rainfall can lead to natural disasters such as floods, landslides, and flash floods [2]. In October 2023, 41.16% of Indonesia experienced low-category rainfall, 49.05% experienced medium-category rainfall, and 9.79% experienced high- to very high-category rainfall. Out of 4879 observation points, 21 stations recorded extreme rainfall exceeding 150 mm/day [3]. Indonesia exhibits three distinct rainfall patterns: monsoon, equatorial, and local [4,5,6,7]. The Meteorology, Climatology, and Geophysics Agency (BMKG) of Indonesia categorizes the country’s regions based on these rainfall patterns, as shown in Figure 1 [8]. The monsoonal pattern is primarily driven by the monsoon circulation, which reverses direction every six months and so produces one rainy season and one dry season per year. Its rainfall distribution is unimodal (one peak): the rainy season runs from December to February (DJF), when the single peak occurs, and the dry season runs from June to August (JJA). Regions with monsoonal rainfall patterns include eastern and southern Sumatra, Java, Bali, the Nusa Tenggara islands, southern Kalimantan, the west coast of South Sulawesi, the west coast of Southeast Sulawesi, the Buton and Muna islands, northern Sulawesi, southern Maluku, and the northern coast of Papua, including Merauke.
On the other hand, areas with the equatorial rainfall pattern experience a bimodal distribution, with two peaks of maximum rainfall, and almost the entire year falls under the rainy season criteria. The equatorial pattern exhibits a semi-monsoonal rhythm with two distinct peaks in the rainy season, typically occurring around March and October. Regions characterized by equatorial rainfall patterns include western Sumatra, northern Kalimantan, parts of Central Sulawesi and South Sulawesi (Luwu Raya and Toraja regions), and central Papua. The local pattern, by contrast, is also unimodal, but its characteristics are distinct from those of the monsoonal pattern. Areas exhibiting local rainfall patterns include Parigi Moutong, Palu, Luwuk Banggai, the Banggai Islands, Taliabu, Sula, southern Buru, southern Seram, Ambon, Sorong, Raja Ampat, Teluk Bintuni, Fak-fak, and the eastern coast of South Sulawesi. Indonesia experiences relatively high average rainfall, ranging from 2000 to 3000 mm per year, making it an important subject for study. This research focuses on Java Island, which exhibits a monsoonal rainfall pattern.

Understanding rainfall patterns is crucial for guiding strategic planning and enhancing resilience across relevant sectors or agencies to mitigate climate variability and natural disasters. These patterns are integral to achieving Sustainable Development Goal (SDG) 13, which emphasizes ‘Climate Action’ and calls for urgent measures to mitigate and adapt to climate change [9]. A thorough understanding and effective management of rainfall patterns are essential for advancing SDG 13 objectives. This includes addressing the impacts of changing rainfall patterns on various sectors, strengthening resilience against extreme weather events, and promoting sustainable practices in water and land management.

Spatial modeling typically involves the development of mathematical or computational frameworks to depict spatial relationships or patterns within data [10]. These models are extensively employed across various disciplines to analyze spatial dimensions and phenomena [11]. Literature has shown that the application of spatial models is instrumental in the prediction of rainfall and climate dynamics. Over recent decades, spatial modeling has advanced through diverse methodological approaches. For instance, Triyatno et al. (2023) utilized a hybrid spatial model to assess flood hazards in the Tarusan Watershed, projecting significant changes from 2019 to 2039 [12]. In another study, Bonsoms et al. (2023) compared the performance and accuracy of traditional linear methods (Multiple Linear Regression and Generalized Additive Models) with five machine learning techniques (K-Nearest Neighbors, Support Vector Machines, Neural Networks, Stochastic Gradient Boosting, and Random Forest) for spatial interpolation of climatological variables in a mountainous region [13]. Anna et al. (2021) developed a spatial model using the rational modification method to mitigate local flooding hazards in Surakarta, Indonesia [14]. Jalbert et al. (2022) applied a Bayesian hierarchical model to interpolate precipitation extremes parameters across a large spatial domain, facilitating IDF curve construction at unmonitored locations [15]. Falah et al. (2023) employed a hybrid Spatial Autoregressive Exogenous (SAR-X) model approach, integrating Casetti’s method, to predict rainfall patterns in West Java, Indonesia [16].

Rainfall interacts with various climate variables such as air temperature, humidity, solar irradiation, and surface pressure, all of which exhibit regional variability [17]. In the context of spatial modeling, the heterogeneity in the SAR-X model is used to describe different parameter values for each spatial observation through the distance between locations [16]. However, a limitation of the SAR-X model is its focus solely on spatial dependence of the response variable, neglecting potential spatial relationships among exogenous variables. In the context of spatial modeling, it is crucial to consider spatial dependencies across both response and exogenous variables. To address this issue, the Spatial Durbin Model (SDM) is proposed in this study; this model is specifically designed to account for spatial interdependencies among both response and exogenous variables [18]. Furthermore, the SDM is enhanced through the expansion using the Casetti approach, which refines parameter estimation across spatial observations [19]. This dual modeling approach aims to provide a more comprehensive understanding of spatial dependencies in the context of rainfall prediction and climate modeling.

By integrating spatial effects and relevant independent variables, the model enhances the precision of rainfall distribution estimates across different regions. The Spatial Durbin Model (SDM) with expansion using Casetti’s approach offers several advantages. Firstly, it facilitates the identification of spatial rainfall patterns, capturing spatial dependencies among observation units within the region. Secondly, through the incorporation of pertinent exogenous variables, the model improves the accuracy of rainfall predictions. Thirdly, the results from the model can support disaster mitigation, water resources management, and resilient infrastructure development against natural disasters. This study explores the application of the SDM with expansion using Casetti’s approach in rainfall prediction, highlighting its significance in advancing understanding of weather dynamics and bolstering disaster risk mitigation, particularly in densely populated regions like Java Island. The research employs an integrated R script developed for the SDM with expansion using Casetti’s approach, implemented through the RShiny version 1.7.4 web application to optimize the prediction process.

2. Materials and Methods

2.1. Inverse Distance Weight Matrix

A matrix that expresses the proximity relationship between locations is called a spatial weight matrix. In this study, the inverse distance weight matrix is used, so that the weights reflect the actual distance between sites. The distance between two locations is determined from the latitude and longitude of each location’s center point. We denote the coordinates of location $i$ by $(u_{i}, v_{i})$, with $i = 1, 2, 3, \dots, N$, where $u_{i}$ and $v_{i}$ are the latitude and longitude of the $i$th location. The Euclidean distance $d_{i j}$ between location $i$ and location $j$ is computed as follows [20]:

d_{i j} = \sqrt{{(u_{i} - u_{j})}^{2} + {(v_{i} - v_{j})}^{2}},

(1)

where

$u_{i}$ is the latitude of the $i$th location, $i = 1, 2, 3, \dots, N$,
$v_{i}$ is the longitude of the $i$th location,
$u_{j}$ is the latitude of the $j$th location, $j = 1, 2, 3, \dots, N$, and
$v_{j}$ is the longitude of the $j$th location.

The Euclidean distances, which are computed in degrees, are converted to kilometers using a conversion factor of 111.319, that is, $d_{i j} \times 111.319$. This conversion factor corresponds to the distance in kilometers represented by one degree of longitude at the equator [21]. The distance between observation locations is then weighted as the inverse distance, computed using the following equation [22]:

w_{i j} = \begin{cases} \frac{1}{d_{i j}}, & i \neq j \\ 0, & i = j \end{cases}

(2)

Next, if the distance weights in a row of the inverse distance weight matrix do not sum to 1, then the weights must be standardized [23] to obtain

\sum_{j = 1}^{N} w_{i j}^{*} = 1, \forall i = 1, 2, 3, \dots, N,

where

w_{i j}^{*} = \frac{w_{i j}}{\sum_{j = 1}^{N} w_{i j}}, \forall i = 1, 2, 3, \dots, N .

(3)
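As an illustration of Equations (1)–(3), the following minimal Python sketch builds a row-standardized inverse distance weight matrix. The function name and the three coordinate pairs are hypothetical and are not taken from the study’s data or R script; the sketch also assumes that all locations are distinct, since a zero distance would make Equation (2) undefined.

```python
import math

KM_PER_DEGREE = 111.319  # conversion factor quoted in the text


def inverse_distance_weights(coords):
    """coords: list of (latitude, longitude) pairs in degrees.

    Returns the row-standardized inverse distance weight matrix.
    Assumes all locations are distinct (no zero distances).
    """
    n = len(coords)
    w = [[0.0] * n for _ in range(n)]
    for i, (u_i, v_i) in enumerate(coords):
        for j, (u_j, v_j) in enumerate(coords):
            if i != j:
                # Eq. (1): Euclidean distance in degrees, converted to km
                d_ij = math.hypot(u_i - u_j, v_i - v_j) * KM_PER_DEGREE
                w[i][j] = 1.0 / d_ij  # Eq. (2); the diagonal stays 0
    # Eq. (3): divide each row by its sum so that every row sums to 1
    return [[w_ij / sum(row) for w_ij in row] for row in w]


# Hypothetical coordinates, for illustration only
coords = [(-6.2, 106.8), (-6.9, 107.6), (-7.8, 110.4)]
W = inverse_distance_weights(coords)
```

Because every entry of a row is divided by the row sum, the constant conversion factor cancels in Equation (3); it affects the unstandardized weights only.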

2.2. Moran Index

One approach to determine the presence of spatial dependencies and check the spatial relationship or correlation between locations is to conduct a spatial autocorrelation test using the Moran Index statistic. Spatial autocorrelation is an estimate of the correlation between observation values associated with locations on the same variable. If there is a systematic pattern in the distribution of a variable, then there is spatial autocorrelation. Spatial autocorrelation explains the dependency of spatial data between one location and another based on a measure of proximity or intersection [18].

I = \frac{n \sum_{i = 1}^{n} \sum_{j = 1}^{n} w_{i j} (x_{i} - \bar{x}) (x_{j} - \bar{x})}{\sum_{i = 1}^{n} \sum_{j = 1}^{n} w_{i j} \sum_{i = 1}^{n} {(x_{i} - \bar{x})}^{2}} .

(4)

The hypothesis formulation in this test is as follows:

$H_{0} : I = I_{0}$ (there is no spatial autocorrelation between locations), vs.
$H_{1} : I \neq I_{0}$ (there is spatial autocorrelation between locations),
with $I_{0} = E (I)$ .

The test statistic employed is expressed below:

Z (I) = \frac{I - E (I)}{\sqrt{V a r (I)}} \sim N (0, 1),

(5)

with

E (I) = - \frac{1}{n - 1},

(6)

V a r (I) = \frac{n^{2} S_{1} - n S_{2} + 3 S_{0}^{2}}{(n^{2} - 1) S_{0}^{2}} - {[E (I)]}^{2},

(7)

S_{0} = \sum_{i = 1}^{n} \sum_{j = 1}^{n} w_{i j}, S_{1} = \frac{1}{2} \sum_{i = 1}^{n} \sum_{j = 1}^{n} {(w_{i j} + w_{j i})}^{2}, S_{2} = \sum_{i = 1}^{n} {(\sum_{j = 1}^{n} w_{i j} + \sum_{j = 1}^{n} w_{j i})}^{2},

where

$I$ is the Moran Index value,
$n$ is the number of observation locations,
$x_{i}$ is the value of the observation variable at the $i$th location,
$x_{j}$ is the value of the observation variable at the $j$th location,
$\bar{x}$ is the mean of the observed values, and
$w_{i j}$ is an element of the standardized weight matrix between locations $i$ and $j$ .

The Moran Index ranges from −1 to 1. A negative value indicates negative spatial autocorrelation, while a positive value signifies positive spatial autocorrelation [24]. We observe from Equation (6) that as the number of spatial locations increases, the expected value $E (I) = I_{0}$ moves closer to zero, indicating that the expected level of spatial autocorrelation under the null hypothesis (no spatial autocorrelation) diminishes as the number of observation locations increases. On the other hand, it becomes more negative with fewer spatial observation locations, indicating a greater expected negative spatial autocorrelation under the null hypothesis.

Decision Rule:

The null hypothesis $H_{0}$ is rejected at the significance level $α$ if $Z (I) \leq - Z_{\frac{α}{2}}$ or $Z (I) \geq Z_{\frac{α}{2}}$ .
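The test in Equations (4)–(7) can be sketched in a few lines of Python. This is an illustrative implementation under the normality form of the variance given above, not the authors’ R code; the function name, the four-location chain of binary weights, and the toy values are assumptions made for the example.

```python
import math


def morans_i_test(x, w):
    """Return (I, E(I), Z(I)) for values x and weight matrix w (list of lists)."""
    n = len(x)
    mean = sum(x) / n
    dev = [x_i - mean for x_i in x]
    s0 = sum(sum(row) for row in w)
    cross = sum(w[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    moran = n * cross / (s0 * sum(d * d for d in dev))  # Eq. (4)
    expected = -1.0 / (n - 1)                            # Eq. (6)
    s1 = 0.5 * sum((w[i][j] + w[j][i]) ** 2 for i in range(n) for j in range(n))
    s2 = sum((sum(w[i]) + sum(w[j][i] for j in range(n))) ** 2 for i in range(n))
    variance = ((n * n * s1 - n * s2 + 3 * s0 ** 2)
                / ((n * n - 1) * s0 ** 2)) - expected ** 2  # Eq. (7)
    z = (moran - expected) / math.sqrt(variance)         # Eq. (5)
    return moran, expected, z


# Toy example: four locations on a line, neighbours share a weight of 1
w = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
x = [1.0, 2.0, 3.0, 4.0]
moran, expected, z = morans_i_test(x, w)

# Two-sided decision rule at alpha = 0.05 (standard normal quantile 1.96)
reject_h0 = abs(z) >= 1.959963984540054
```

In this toy case the index is positive (neighbouring values are similar), but with only four locations the statistic does not exceed the 5% critical value.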

2.3. Spatial Durbin Model

The Spatial Durbin Model (SDM) is a spatial regression model similar to the Spatial Autoregressive (SAR) model, featuring a spatial lag in the response variable. However, unlike the SAR model, the SDM is distinguished by incorporating a spatial lag in the exogenous variables [18]. According to [25], the SDM is described as follows

y = ρ W y + α 1_{n} + \tilde{X} β + W \tilde{X} θ + ε with ε \overset{i i d}{\sim} N (0, σ^{2} I),

(8)

or, elementwise,

y_{i} = ρ \sum_{j = 1}^{n} w_{i j} y_{j} + α + \sum_{l = 1}^{k} β_{l} x_{i l} + \sum_{l = 1}^{k} θ_{l} \sum_{j = 1}^{n} w_{i j} x_{j l} + ε_{i},

(9)

which can be written compactly as $y = ρ W y + P δ + ε$, where

$y$ is the vector of the dependent variable of size (n × 1),
$1_{n}$ is the vector of ones of size (n × 1),
$\tilde{X}$ is the matrix of independent variables of size (n × k),
$P$ = $[\begin{matrix} 1_{n} & \tilde{X} & W \tilde{X} \end{matrix}]$ ,
$δ$ = $[\begin{matrix} α \\ β \\ θ \end{matrix}]$ ,
$ρ$ is the spatial lag coefficient of the dependent variable,
$α$ is the constant parameter,
$β$ is the vector of regression parameters of size (k × 1),
$θ$ is the vector of spatial lag parameters of the independent variables of size (k × 1),
$W$ is the spatial weight matrix of size (n × n), and
$ε$ is the error vector of size (n × 1).
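To make the structure of $P$ concrete, the sketch below assembles $P = [1_{n}, \tilde{X}, W \tilde{X}]$ and evaluates the systematic part $ρ W y + P δ$ of the model for a toy data set. All function names and numbers are hypothetical illustrations; nothing here reproduces the study’s data.

```python
def spatial_lag(w, x):
    """Return W x for a weight matrix w (n x n) and a data matrix x (n x k)."""
    n, k = len(x), len(x[0])
    return [[sum(w[i][j] * x[j][l] for j in range(n)) for l in range(k)]
            for i in range(n)]


def sdm_design(w, x):
    """Return P = [1_n, X, W X], one row per location."""
    wx = spatial_lag(w, x)
    return [[1.0] + list(x[i]) + wx[i] for i in range(len(x))]


def sdm_mean(rho, w, y, p, delta):
    """Systematic part of the SDM: rho * W y + P delta (no error term)."""
    n = len(y)
    wy = [sum(w[i][j] * y[j] for j in range(n)) for i in range(n)]
    return [rho * wy[i] + sum(p_il * d_l for p_il, d_l in zip(p[i], delta))
            for i in range(n)]


# Toy example with n = 3 locations and k = 2 exogenous variables
w = [[0.0, 0.5, 0.5],
     [0.5, 0.0, 0.5],
     [0.5, 0.5, 0.0]]
x = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
P = sdm_design(w, x)  # each row: [1, x_i1, x_i2, (Wx)_i1, (Wx)_i2]

y = [2.0, 4.0, 6.0]
delta = [1.0, 0.0, 0.0, 0.0, 0.0]  # alpha = 1, beta = theta = 0
mean = sdm_mean(0.5, w, y, P, delta)
```

Each row of $P$ holds a location’s own covariates followed by the weighted average of its neighbours’ covariates, which is what distinguishes the SDM from the SAR model.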

2.4. Spatial Expansion with Casetti’s Approach

The Spatial Expansion Model (Casetti, 1972) [19] introduced in this study addresses spatial heterogeneity by accounting for varying parameter values across spatial observations, determined by the Euclidean distance between locations based on their coordinates. The spatial expansion using Casetti’s approach, adopting a linear regression approach, is formulated as follows:

y = X β + ε with ε \overset{i i d}{\sim} N (0, σ^{2} I), β = Z J β_{0},

(10)

where

$y$ is the vector of the dependent variable of size $(n \times 1)$,
$X$ is the matrix of independent variables of size $(n \times n k)$,
$Z$ is the location matrix of size $(n k \times 2 n k)$, built from the elements $Z_{x i}$ and $Z_{y i}$, $i = 1, \dots, n$, which represent the latitude and longitude of each observation,
$J$ is the expansion of the identity matrix of size $(2 n k \times 2 k)$,
$β$ is the vector of size $(n k \times 1)$ containing the parameters of all $k$ explanatory variables at each observation,
$β_{0}$ is the parameter vector of size $(2 k \times 1)$ formed by $β_{l a t i t u d e}$ and $β_{l o n g i t u d e}$,
$\otimes$ is the Kronecker product,
$ε$ is the error vector of size $(n \times 1)$, and
$s_{i}$ is the $i$th location, with $i = 1, \dots, n$ .

The matrix form of the model can be written as follows:

y = (\begin{matrix} y (s_{1}) \\ y (s_{2}) \\ ⋮ \\ y (s_{n}) \end{matrix}), β = (\begin{matrix} β_{1} (s_{1}) \\ β_{2} (s_{1}) \\ ⋮ \\ β_{k} (s_{n}) \end{matrix}), ε = (\begin{matrix} ε (s_{1}) \\ ε (s_{2}) \\ ⋮ \\ ε (s_{n}) \end{matrix}), β_{0} = (\begin{matrix} β_{l a t i t u d e} \\ β_{l o n g i t u d e} \end{matrix}),

X = (\begin{matrix} x_{11} & \dots & x_{1 k} & 0 & \dots & 0 & \dots & 0 & \dots & 0 \\ 0 & \dots & 0 & x_{21} & \dots & x_{2 k} & \dots & 0 & \dots & 0 \\ ⋮ & & ⋮ & ⋮ & & ⋮ & ⋱ & ⋮ & & ⋮ \\ 0 & \dots & 0 & 0 & \dots & 0 & \dots & x_{n 1} & \dots & x_{n k} \end{matrix}),

Z = (\begin{matrix} Z_{x 1} \otimes I_{k} & Z_{y 1} \otimes I_{k} & \dots & 0 & 0 \\ ⋮ & ⋮ & ⋱ & ⋮ & ⋮ \\ 0 & 0 & \dots & Z_{x n} \otimes I_{k} & Z_{y n} \otimes I_{k} \end{matrix}), J = (\begin{matrix} I_{k} & 0 \\ 0 & I_{k} \\ ⋮ & ⋮ \\ 0 & I_{k} \end{matrix}) .
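The block structure above implies that the product $X Z J$ has a simple form: row $i$ is the covariate vector of location $i$ scaled once by its latitude and once by its longitude. The following Python sketch checks this on a toy example by building the full matrices with the stated dimensions and comparing against the direct construction. Function names and numbers are hypothetical, and the sketch is a reading of the matrix layout shown here rather than the authors’ implementation.

```python
def matmul(a, b):
    """Plain matrix product for lists of lists."""
    return [[sum(a_il * b[l][j] for l, a_il in enumerate(row))
             for j in range(len(b[0]))] for row in a]


def expansion_matrices(x, lat, lon):
    """Build X (n x nk), Z (nk x 2nk) and J (2nk x 2k) as laid out above."""
    n, k = len(x), len(x[0])
    X = [[0.0] * (n * k) for _ in range(n)]
    Z = [[0.0] * (2 * n * k) for _ in range(n * k)]
    J = [[0.0] * (2 * k) for _ in range(2 * n * k)]
    for i in range(n):
        for l in range(k):
            X[i][i * k + l] = x[i][l]                  # covariates of location i
            Z[i * k + l][2 * i * k + l] = lat[i]       # block Z_xi (x) I_k
            Z[i * k + l][2 * i * k + k + l] = lon[i]   # block Z_yi (x) I_k
        for m in range(2 * k):
            J[2 * i * k + m][m] = 1.0                  # stacked identity blocks
    return X, Z, J


def expanded_covariates(x, lat, lon):
    """Direct form: row i of X Z J is [lat_i * x_i, lon_i * x_i]."""
    return [[lat[i] * v for v in x[i]] + [lon[i] * v for v in x[i]]
            for i in range(len(x))]


# Toy example: n = 3 locations, k = 2 covariates, made-up coordinates
x = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
lat = [-6.0, -7.0, -8.0]
lon = [106.0, 108.0, 110.0]
X, Z, J = expansion_matrices(x, lat, lon)
A = matmul(matmul(X, Z), J)
```

The direct form avoids storing the large sparse matrices $X$ and $Z$, which grow quadratically with the number of locations.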

2.5. Spatial Durbin Model with Expansion Using Casetti’s Approach

Integrating the SDM with the expansion using Casetti’s approach, that is, substituting Equation (10) into Equation (8), gives the following:

y = ρ W y + α 1_{n} + X Z J β_{0} + W \tilde{X} θ + ε with ε \overset{i i d}{\sim} N (0, σ^{2} I) .

(11)

Letting $A = X Z J$, it follows that:

y = ρ W y + α 1_{n} + A β_{0} + W \tilde{X} θ + ε with ε \overset{i i d}{\sim} N (0, σ^{2} I),

(12)

y = ρ W y + U δ + ε,

(13)

where $U = [\begin{matrix} 1_{n} & A & W \tilde{X} \end{matrix}]$ and $δ = [\begin{matrix} α \\ β_{0} \\ θ \end{matrix}]$ .

2.6. Parameter Estimation

The random error in the SDM with expansion using Casetti’s approach is assumed to follow a normal distribution. Consequently, parameter estimation in this model adopts the SAR-X estimation method, employing the Maximum Likelihood Estimation (MLE) technique [16]. Notable sources contributing to this approach include Ord [26], Smirnov and Anselin [27], Robinson and Rossi [28], and Feng [29]. Equation (13) can be expressed as follows:

y = ρ W y + U δ + ε with ε \overset{i i d}{\sim} N (0, σ^{2} I),

ε = y - ρ W y - U δ .

(14)

The probability density function used is expressed as follows:

f (y) = {(\frac{1}{2 π σ^{2}})}^{\frac{n}{2}} \exp [- \frac{{(y - ρ W y - U δ)}^{T} (y - ρ W y - U δ)}{2 σ^{2}}] .

(15)

The likelihood function of the dependent variable $y$ is formulated as follows:

\begin{aligned} L (ρ, δ | y) &= f (y | ρ, δ) \\ &= {(\frac{1}{2 π σ^{2}})}^{\frac{n}{2}} \exp [- \frac{{(y - ρ W y - U δ)}^{T} (y - ρ W y - U δ)}{2 σ^{2}}] . \end{aligned}

(16)

Furthermore, the log-likelihood function is obtained as:

\begin{aligned} \ln L (ρ, δ | y) &= \ln \{{(\frac{1}{2 π σ^{2}})}^{\frac{n}{2}} \exp [- \frac{{(y - ρ W y - U δ)}^{T} (y - ρ W y - U δ)}{2 σ^{2}}]\} \\ &= - \frac{n}{2} \ln (2 π) - \frac{n}{2} \ln σ^{2} - \frac{{(y - ρ W y - U δ)}^{T} (y - ρ W y - U δ)}{2 σ^{2}} . \end{aligned}

(17)

The estimates of $ρ$ and $δ$ are obtained by maximizing the log-likelihood function.

To obtain the maximum likelihood estimate $\hat{ρ}$, we take the first derivative of Equation (17) with respect to $ρ$ as follows:

\begin{aligned} \frac{\partial \ln L (ρ, δ | y)}{\partial ρ} &= - \frac{{(y - ρ W y - U δ)}^{T} (- W y)}{σ^{2}} \\ &= \frac{{(y - ρ W y - U δ)}^{T} (W y)}{σ^{2}} . \end{aligned}

(18)

Setting ${\frac{\partial \ln L (ρ, δ | y)}{\partial ρ}|}_{ρ = \hat{ρ}} = 0$, it follows that

\frac{{(y - ρ W y - U δ)}^{T} (W y)}{σ^{2}} = 0 .

(19)

Multiplying Equation (19) by $σ^{2}$, we obtain

{(y - ρ W y - U δ)}^{T} (W y) = 0 .

(20)

Since the left-hand side is a scalar, transposing Equation (20) gives

{(W y)}^{T} (y - ρ W y - U δ) = 0 .

(21)

Expanding Equation (21), we obtain

{(W y)}^{T} (y - U δ) - ρ {(W y)}^{T} W y = 0,

(22)

that is,

{(W y)}^{T} (y - U δ) = {(W y)}^{T} ρ W y .

(23)

Furthermore, multiplying both sides of Equation (23) by ${({(W y)}^{T} W y)}^{- 1}$ allows us to obtain

{({(W y)}^{T} W y)}^{- 1} {(W y)}^{T} (y - U δ) = {({(W y)}^{T} W y)}^{- 1} ({(W y)}^{T} W y) ρ,

(24)

\hat{ρ} = {({(W y)}^{T} W y)}^{- 1} {(W y)}^{T} (y - U δ) .

(25)

There is a constraint that must be applied to the parameter $ρ$. It was noted by Anselin and Florax (1994) that this parameter can have realistic values in the following range [30]:

\frac{1}{λ_{\min}} < ρ < \frac{1}{λ_{\max}},

where $λ_{\min}$ represents the minimum eigenvalue of the standardized spatial contiguity matrix $W$ and $λ_{\max}$ denotes the largest eigenvalue of this matrix. This requires that the optimization search be constrained to values of $ρ$ within this range [19].

To obtain the maximum likelihood estimate $\hat{δ}$, we take the first derivative of Equation (17) with respect to $δ$ as follows:

\begin{aligned} \frac{\partial \ln L (ρ, δ | y)}{\partial δ} &= - \frac{{(y - ρ W y - U δ)}^{T} (- U)}{σ^{2}} \\ &= \frac{{(y - ρ W y - U δ)}^{T} U}{σ^{2}} . \end{aligned}

(26)

Setting ${\frac{\partial \ln L (ρ, δ | y)}{\partial δ}|}_{δ = \hat{δ}} = 0$ gives the following equation:

\frac{{(y - ρ W y - U δ)}^{T} U}{σ^{2}} = 0 .

(27)

Multiplying Equation (27) by $σ^{2}$, we obtain

{(y - ρ W y - U δ)}^{T} U = 0 .

(28)

Transposing Equation (28) results in the expression below:

U^{T} (y - ρ W y - U δ) = 0 .

(29)

Expanding Equation (29), we obtain

U^{T} (y - ρ W y) - U^{T} U δ = 0,

(30)

that is,

U^{T} U δ = U^{T} (y - ρ W y) .

(31)

We multiply both sides of Equation (31) by ${(U^{T} U)}^{- 1}$ to obtain the following expression:

{(U^{T} U)}^{- 1} (U^{T} U) δ = {(U^{T} U)}^{- 1} U^{T} (y - ρ W y),

(32)

\hat{δ} = {(U^{T} U)}^{- 1} U^{T} (y - ρ W y) .

(33)
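The closed forms for $\hat{ρ}$ in Equation (25) and $\hat{δ}$ in Equation (33) each depend on the other parameter. The Python sketch below only evaluates each formula with the other parameter treated as known; how the two are combined, and how the constraint on $ρ$ is enforced, is not specified here and is left open. The helper names and toy numbers are hypothetical, and $U$ is assumed to have full column rank so that $U^{T} U$ is invertible.

```python
def solve(a, b):
    """Solve the square system a z = b by Gaussian elimination with pivoting."""
    m = len(b)
    aug = [list(a[i]) + [b[i]] for i in range(m)]
    for c in range(m):
        pivot = max(range(c, m), key=lambda r: abs(aug[r][c]))
        aug[c], aug[pivot] = aug[pivot], aug[c]
        for r in range(c + 1, m):
            factor = aug[r][c] / aug[c][c]
            for j in range(c, m + 1):
                aug[r][j] -= factor * aug[c][j]
    z = [0.0] * m
    for r in reversed(range(m)):
        tail = sum(aug[r][j] * z[j] for j in range(r + 1, m))
        z[r] = (aug[r][m] - tail) / aug[r][r]
    return z


def rho_hat(wy, y_minus_u_delta):
    """rho = ((Wy)^T Wy)^{-1} (Wy)^T (y - U delta); inputs are length-n lists."""
    numerator = sum(a * b for a, b in zip(wy, y_minus_u_delta))
    return numerator / sum(a * a for a in wy)


def delta_hat(u, y_minus_rho_wy):
    """delta = (U^T U)^{-1} U^T (y - rho W y), solved without an explicit inverse."""
    q = len(u[0])
    utu = [[sum(row[a] * row[b] for row in u) for b in range(q)] for a in range(q)]
    uty = [sum(row[a] * t for row, t in zip(u, y_minus_rho_wy)) for a in range(q)]
    return solve(utu, uty)


# Toy check: if y - U delta equals 0.3 * Wy exactly, rho_hat returns 0.3
wy = [1.0, 2.0, 3.0]
rho = rho_hat(wy, [0.3, 0.6, 0.9])

# Toy check: if y - rho W y equals U [2, 0.5]^T exactly, delta_hat returns [2, 0.5]
u = [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0]]
delta = delta_hat(u, [2.5, 3.0, 3.5, 4.0])
```

Solving the normal equations directly, rather than forming ${(U^{T} U)}^{- 1}$, is a common numerical choice and gives the same estimate.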

2.7. Mean Absolute Percentage Error (MAPE)

To evaluate the model’s performance, the Mean Absolute Percentage Error (MAPE) is calculated as follows:

\mathrm{MAPE} = (\frac{1}{n} \sum_{i = 1}^{n} |\frac{y (s_{i}) - \hat{y} (s_{i})}{y (s_{i})}|) \times 100 \%,

(34)

where $y (s_{i})$ is the actual value at location $s_{i}$, $\hat{y} (s_{i})$ is the predicted value at location $s_{i}$, and $n$ is the number of observation locations. According to Lawrence’s criteria (2009) [31], MAPE values can be categorized as follows: MAPE < 10% indicates very accurate prediction, 10–20% indicates good prediction, 20–50% indicates reasonable prediction, and MAPE > 50% indicates inaccurate prediction.
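Equation (34) and the accuracy categories translate directly into code. The following Python sketch uses hypothetical function names and made-up rainfall values; it assumes that no actual value is zero, since the percentage error is undefined in that case, and it assigns the boundary values 20% and 50% to the lower category as one possible convention.

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error in percent; actual values must be non-zero."""
    n = len(actual)
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / n


def accuracy_category(mape_percent):
    """Map a MAPE value (in percent) to the categories listed in the text."""
    if mape_percent < 10:
        return "very accurate"
    if mape_percent <= 20:
        return "good"
    if mape_percent <= 50:
        return "reasonable"
    return "inaccurate"


# Made-up actual and predicted rainfall values (mm), for illustration only
actual = [150.0, 200.0, 250.0]
predicted = [147.0, 206.0, 245.0]
score = mape(actual, predicted)
label = accuracy_category(score)
```

The individual errors here are 2%, 3%, and 2%, so the MAPE is their average, which falls in the "very accurate" category.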

2.8. Knowledge Discovery in Databases Methodology

The fast growth of data accumulation, propelled by advancements in science and technology, results in the prevalence of vast datasets stored in centralized databases. Big data encompass the management of extensive data repositories, exemplified by the NASA POWER database which archives comprehensive global climate data collected over time. The defining attributes of big data are encapsulated in the five Vs framework: veracity, variety, velocity, volume, and value [32].

Data mining techniques are pivotal in extracting valuable insights from big data, a process integral to Knowledge Discovery in Databases (KDD). Key challenges such as scalability for handling large datasets, managing high dimensionality and complexity, addressing ownership concerns, and enabling non-traditional analytical approaches drive advancements in data mining methodologies [33]. In this study, data mining serves both descriptive and predictive roles within the KDD framework, covering three fundamental stages: preprocessing, data mining, and post-processing.

3. Real Data Application

3.1. Research Location

This study focuses on Java Island, one of the largest islands in Indonesia. Administratively, as seen in Figure 2, Java Island has six provinces with 119 districts covering an area of about 128,297 square kilometers. Java is also one of Indonesia’s most densely populated islands, and its terrain includes plains, mountains, and coastal regions. Java Island was chosen for this study because of its heavy rainfall and frequent landslides and floods. The island is located near the equator and has a tropical climate with heavy rains all year. At the same time, Java Island has significant climatic diversity, with rainfall varying strongly from one region to another. We apply the proposed technique, the SDM with Casetti expansion, to predict rainfall on Java Island in order to support better planning to reduce the risk of rainfall-related disasters.

3.2. Data Description

The objective of this study was to apply the Spatial Durbin Model (SDM) with expansion using Casetti’s approach to analyze climate data specific to Java Island, Indonesia. The study utilized secondary data sourced from the NASA POWER observation satellite, accessible through the data access viewer (https://power.larc.nasa.gov/data-access-viewer/, accessed on 5 July 2023). Notably, NASA POWER data are accessible under a General Public License, facilitating widespread usability. The dataset is extensive, comprising 6636.982 terabytes of recorded data (https://disc.gsfc.nasa.gov/, accessed on 13 June 2024), encompassing domains such as agroclimatology, renewable energy, and sustainable buildings. Data are available at various temporal resolutions, including hourly, daily, monthly, yearly, and climatology data, and in various formats, including ASCII, CSV, GeoJSON, ICASA, and NetCDF.

The accessible parameters included solar fluxes and related temperatures/thermal IR flux, humidity/precipitation, wind/pressure, and soil properties. During the data collection phase, a rigorous parameter selection process was conducted to gather climatic variables specific to the Java Island region, encompassing six provinces and 119 districts/cities from January 1984 to July 2023, recorded on a monthly basis. The selected climate variables comprised rainfall, air temperature, humidity, solar irradiation, and surface pressure, together with the geographical coordinates (latitude and longitude). The collected data were stored in comma-separated values (.csv) format.

This study focused specifically on precipitation parameters across Java Island’s 119 districts and cities using data accessed from the NASA POWER website, which provides Numerical Weather Prediction (NWP) data. Each coordinate point is represented by a grid resolution of (0.5 × 0.625)°, corresponding to approximately 50 km [34]. During data preprocessing, 55 duplicate locations were identified and removed, resulting in a dataset of 64 distinct districts/cities for subsequent predictive modeling, as detailed in Table 1. These distinct districts/cities can be further classified into six major provinces, namely, Banten, DKI Jakarta, Jawa Barat, Jawa Tengah, Daerah Istimewa Yogyakarta, and Jawa Timur.

The comparison of climate variables across these provinces reveals significant regional diversity in climate patterns. The average rainfall across the different cities/districts is 175.2 mm, with a range of 137.3 mm to 228.8 mm. However, rainfall shows considerable variation, with Jawa Barat and Jawa Tengah experiencing the widest range of levels, indicating distinct precipitation patterns across these provinces. Kota Tasikmalaya in the province of Jawa Barat has the highest reported rainfall and surface pressure. Surface pressure, ranging from 90.93 kPa to 100.96 kPa, with a mean of 97.90 kPa, remains generally stable with minor fluctuations observed among provinces. Humidity levels range from 75.46% to 89.10%, averaging 82.12%. Air temperature, varying between 21.36 °C and 27.90 °C, with an average of 25.50 °C, exhibits noticeable differences, with DKI Jakarta generally experiencing warmer conditions, while Jawa Barat tends to be cooler, and other provinces maintain moderate temperatures. Solar irradiation, which reflects sunlight exposure, also shows notable variation across provinces. Banten exhibits lower levels of solar irradiation with a median of 17.63 W/m², while Jawa Timur displays higher levels with a median of 19.62 W/m². The remaining provinces demonstrate moderate to high exposure to sunlight. Among the cities/districts, Kab Situbondo in the province of Jawa Timur has the highest solar irradiation but the lowest rainfall level and surface pressure.

3.3. RShiny for SDM with Expansion Using Casetti’s Approach

The prediction procedure described in Section 2 was implemented through the development of an RShiny web application for the Spatial Durbin Model (SDM) with expansion using Casetti’s approach. This application is accessible via the following link: https://andriyanafalah.shinyapps.io/SDM-Expansion/ (accessed on 11 June 2024). The source code for the application is provided in Appendix B. The application features six dashboards, each serving a distinct purpose:

Description of model: This section explains the formulation of the SDM with expansion using Casetti’s approach.
Import data: In this section, users can upload data files in .csv or .txt format. The data should contain the location coordinates (latitude and longitude), the dependent variable, and the exogenous variables.
Vector and matrix: This section constructs the following vectors and matrices from the data:
  a. Vector $y$ contains the dependent variable at each location.
  b. Matrices $X$ and $\tilde{X}$ contain the exogenous variables.
  c. Matrix $Z$ contains the location coordinates (latitude and longitude).
  d. Matrix $J$ is an identity matrix whose size corresponds to the four exogenous variables in matrix $X$ or matrix $\tilde{X}$.
  e. Matrix $W$ is the inverse distance weight matrix, computed from the location coordinates (latitude and longitude).
  f. Kronecker $Z$ is the Kronecker product of $Z$ with the identity matrix of the exogenous variables, $(Z \otimes I_{k})$.
  g. Matrix $A$ is the product of matrices $X$, $Z$, and $J$.
  h. Matrix $U$ combines the vector of ones, matrix $A$, and the product of matrices $W$ and $\tilde{X}$.
Result of prediction: This section reports the parameter estimates $\hat{ρ}$, $\hat{α}$, ${\hat{β}}_{0}$, and $\hat{θ}$, together with the prediction results $\hat{y}$, the absolute errors, and the MAPE.
Download data: This menu allows users to download the prediction calculation data.
Created by: This section lists the names of the RShiny development team.
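
To illustrate how a matrix such as $W$ in the "Vector and matrix" dashboard can be obtained, the following Python sketch builds a row-standardised inverse distance weight matrix from latitude and longitude pairs. It is not the application's R code. The function name `inverse_distance_weights`, the `power` argument, the use of Euclidean distance in degrees, and the row standardisation are assumptions made for this example; the three coordinates are the first three locations in Table 1.

```python
import math


def inverse_distance_weights(coords, power=1.0):
    """Return a row-standardised inverse distance weight matrix.

    coords is a list of (latitude, longitude) pairs. Distances are plain
    Euclidean distances in degrees (an assumption for this sketch), and
    all locations must be distinct so that no distance is zero.
    """
    n = len(coords)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d = math.dist(coords[i], coords[j])
                w[i][j] = 1.0 / d ** power
        # Row standardisation: each row of W sums to one.
        row_sum = sum(w[i])
        w[i] = [value / row_sum for value in w[i]]
    return w


# Serang City, Pandeglang, and Tangerang City (coordinates from Table 1).
coords = [(-6.15, 106.0), (-6.3092, 106.1047), (-6.171389, 106.640556)]
W = inverse_distance_weights(coords)
for row in W:
    print([round(value, 3) for value in row])
```

In this small example Pandeglang receives a larger weight than Tangerang City in the row for Serang City, because it is the closer of the two neighbours.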

3.4. KDD for SDM with Expansion Using Casetti’s Approach

The research flowchart for the SDM with expansion using Casetti’s approach applied to climate data in Java Island, Indonesia, using the Knowledge Discovery in Databases (KDD) methodology, is depicted in Figure 3. Climate data from NASA POWER undergo a data preprocessing stage via an RShiny web application, accessible at https://annisanurfalah.shinyapps.io/Pre-ProcessingData/ (accessed on 26 September 2023). This stage includes variable selection, time transformation from daily to monthly data, and the removal of missing values and duplicates. As a result, climate variable data are obtained for 64 districts/cities as cross-sectional data. Furthermore, the pre-processed data are subsequently utilized as input for the data mining process. The results from the preprocessing stage are used to construct the inverse distance weight matrix, and spatial autocorrelation is assessed using Moran’s Index and Scatterplot. If spatial autocorrelation is detected, the process continues with the SDM with expansion using Casetti’s approach, employing the Maximum Likelihood Estimation (MLE) method. Parameter estimation for the SDM with expansion using Casetti’s approach is then calculated, followed by an evaluation of prediction accuracy using the Mean Absolute Percentage Error (MAPE). The prediction results are post-processed by visualizing spatial mapping through choropleth maps and providing interpretations to derive valuable insights.

3.5. Result of Moran’s Index and Scatterplot

The Moran Index was computed to assess the presence of spatial autocorrelation between the observation sites. Equation (4) was used to compute the Moran Index for each climate variable. The results of these calculations are presented in Table 2.
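
As a rough illustration of this calculation, the Python sketch below computes Moran's Index in its standard global form together with its expectation under no spatial autocorrelation, $E(I) = -1/(n-1)$. It is assumed here that Equation (4) has this standard form; the function names and the four-location toy data are hypothetical and are not taken from the study.

```python
def morans_i(values, w):
    """Global Moran's I for a list of values and a weight matrix w."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in w)  # sum of all weights
    num = sum(w[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / s0) * num / den


def expected_morans_i(n):
    """Expected value of Moran's I under the null of no autocorrelation."""
    return -1.0 / (n - 1)


# Toy data: four locations on a line, each linked to its direct neighbours.
w_chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
values = [1.0, 2.0, 3.0, 4.0]
i_toy = morans_i(values, w_chain)
e_64 = expected_morans_i(64)
print(round(i_toy, 3), round(e_64, 3))
```

For $n = 64$ locations the expectation is about −0.016, which is close to the $E(I)$ values reported in Table 2.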

Based on Table 2, all p-values are below the 5% significance level (and even below 1%), indicating the presence of spatial autocorrelation. This suggests that the proximity between locations influences the climate variables. In addition to the Moran Index, a Moran Scatterplot shows each region’s characteristics and the overall clustering tendency. Each variable is displayed in a four-quadrant graph, in which the lines through the mean values delineate the four quadrants representing the possible groupings. A district/city whose value is above the average is classed as high, and one whose value is below the average is classed as low. Figure 4 displays the Moran Scatterplots for every climate variable.

Based on the results of the spatial autocorrelation test presented in Table 2 and the Moran Scatterplots depicted in Figure 4, a positive spatial autocorrelation is evident for all climate variables (rainfall, air temperature, humidity, solar irradiation, and surface pressure) across the 64 districts/cities of Java Island. Figure 4 clearly indicates this positive spatial autocorrelation, as shown by the increasing trends in quadrant I (high–high) and quadrant III (low–low). This suggests that districts/cities with high values of climate variables are in spatial proximity to other districts/cities with similarly high values. Conversely, districts/cities with low values of climate variables tend to be situated near other districts/cities with low values of these variables.
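
The quadrant reading of a Moran Scatterplot can be sketched in Python as follows. Each location is labelled by comparing its own deviation from the mean with the weighted average deviation of its neighbours. The function name `moran_quadrants`, the treatment of values exactly at the mean as low, and the toy data are assumptions made for this illustration, not part of the study.

```python
def moran_quadrants(values, w):
    """Label each location with its Moran Scatterplot quadrant.

    I = high-high, II = low-high, III = low-low, IV = high-low, where the
    first term refers to the location and the second to its neighbours.
    Values exactly at the mean are treated as low (an assumption).
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    labels = []
    for i in range(n):
        # Spatial lag: weighted average deviation of the neighbours.
        lag = sum(w[i][j] * dev[j] for j in range(n))
        if dev[i] > 0 and lag > 0:
            labels.append("I")
        elif dev[i] <= 0 and lag > 0:
            labels.append("II")
        elif dev[i] <= 0 and lag <= 0:
            labels.append("III")
        else:
            labels.append("IV")
    return labels


# Toy data: four locations on a line with row-standardised weights.
w_row = [
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0],
]
values = [1.0, 2.0, 3.0, 4.0]
quadrants = moran_quadrants(values, w_row)
print(quadrants)
```

In this toy example the two low locations fall in quadrant III and the two high locations in quadrant I, which is the pattern associated with positive spatial autocorrelation.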

3.6. Prediction Result of SDM with Expansion Using Casetti’s Approach

The estimation of prediction parameters in the SDM with expansion using Casetti’s approach was conducted via the RShiny web application. An estimated $\hat{ρ}$ value of 0.999 was obtained, producing an optimum spatial lag with a positive value $(\hat{ρ} > 0)$ and indicative of a spatial lag dependence. This $\hat{ρ}$ value signifies the influence of adjacent locations within Java Island on rainfall prediction data. It further highlights the spatial autocorrelation of rainfall on Java Island, indicating that districts and cities with high rainfall levels tend to be spatially clustered with others exhibiting similar high rainfall levels, or with minimal variation. The results of the parameter estimate calculation ${\hat{β}}_{0}$ and $\hat{θ}$ are shown in Table 3.

The estimate ${\hat{β}}_{0}$ measures the direct impact of the exogenous variables on rainfall levels within the same region, while the estimate $\hat{θ}$ captures the spillover effects of the exogenous variables on rainfall levels. Based on the estimate ${\hat{β}}_{0}$, we can derive the estimate $\hat{β}$, from which individual parameter estimates were derived for each exogenous variable across the 64 districts and cities. The SDM with expansion using Casetti’s approach produces different parameter estimates for each exogenous variable at each location due to the expansion of the exogenous variable matrix involving latitude and longitude information at each location. Detailed equations for the SDM with expansion using Casetti’s approach at each location are provided in Appendix A. A visualization of rainfall prediction in 64 districts/cities in Java Island is shown in Figure 5.
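
The way location-specific coefficients enter a prediction can be sketched in Python as follows. The sketch mirrors the form of the equations in Appendix A: a spatial lag of rainfall, an intercept, a direct term with expanded coefficients, and a spillover term. It assumes the usual Casetti expansion in which each coefficient is a linear function of latitude and longitude. How the coordinates are scaled before the expansion is not stated in this section, so the inputs are taken as given, and all function names and toy numbers are hypothetical.

```python
def expanded_beta(lat, lon, beta_lat, beta_lon):
    """Location-specific slopes: beta_k(s) = lat * beta_lat[k] + lon * beta_lon[k]."""
    return [lat * b_lat + lon * b_lon for b_lat, b_lon in zip(beta_lat, beta_lon)]


def predict_location(i, w, y, x, rho, alpha, beta_i, theta):
    """Prediction for location i in the form used in Appendix A.

    w is the weight matrix, y the observed responses, x[j][c] the value of
    exogenous variable c at location j, beta_i the expanded slopes for
    location i, and theta the spillover coefficients.
    """
    n = len(y)
    k = len(theta)
    lag_y = sum(w[i][j] * y[j] for j in range(n))
    direct = sum(beta_i[c] * x[i][c] for c in range(k))
    spillover = sum(
        theta[c] * sum(w[i][j] * x[j][c] for j in range(n)) for c in range(k)
    )
    return rho * lag_y + alpha + direct + spillover


# Toy example with two locations and one exogenous variable (made-up numbers).
w = [[0.0, 1.0], [1.0, 0.0]]
y = [10.0, 20.0]
x = [[1.0], [2.0]]
beta_0 = expanded_beta(lat=2.0, lon=3.0, beta_lat=[1.0], beta_lon=[0.5])
y_hat_0 = predict_location(0, w, y, x, rho=0.5, alpha=1.0, beta_i=beta_0, theta=[3.0])
print(beta_0, y_hat_0)
```

Because the expanded slopes depend on the coordinates, two locations with identical exogenous values can still receive different direct effects, while the spillover coefficients are shared by all locations.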

Based on Figure 5, the highest monthly rainfall predictions, with values above 210 mm, are in Pangandaran, Tasikmalaya City, Cilacap, Garut, Bandung, and Bandung City, while the lowest monthly rainfall predictions, with values below 140 mm, are in Lumajang, Jember, Probolinggo, Probolinggo City, Situbondo, and Bondowoso. The proposed model has a MAPE value of 2.78%, indicating very accurate prediction. These results are consistent with the rainfall patterns across cities/districts and provinces in Java Island described in Section 3.2.
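
For reference, the MAPE used to assess the predictions can be computed as in the short Python sketch below, which uses the standard definition (the mean of the absolute errors relative to the observed values, expressed as a percentage). The function names and the two-point toy data are illustrative only and are not the study's data.

```python
def absolute_errors(actual, predicted):
    """Absolute error |y - y_hat| at each location."""
    return [abs(a - p) for a, p in zip(actual, predicted)]


def mape(actual, predicted):
    """Mean Absolute Percentage Error in percent; observed values must be non-zero."""
    n = len(actual)
    return 100.0 / n * sum(abs((a - p) / a) for a, p in zip(actual, predicted))


# Toy data (made-up rainfall values in mm).
actual = [100.0, 200.0]
predicted = [110.0, 190.0]
errors = absolute_errors(actual, predicted)
score = mape(actual, predicted)
print(errors, round(score, 2))
```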

4. Discussion

The Spatial Durbin Model (SDM) with expansion using Casetti’s approach was applied to analyze rainfall and climate variables across 64 districts/cities in Java Island, Indonesia, which is characterized by a monsoonal rainfall pattern. The prediction calculation employed an integrated R script developed for the SDM with expansion using Casetti’s approach, implemented through the RShiny web application, which comprises six interactive dashboards. The comprehensive source code is archived in a GitHub repository, with the link provided in Appendix B. This resource is instrumental for facilitating the replication and validation of both the model framework and findings by fellow researchers [35,36].

Moran’s Index and Scatterplots indicate positive spatial autocorrelation among the climate variables (rainfall, surface pressure, humidity, air temperature, and solar irradiation) across the 64 districts/cities of Java Island. This finding reveals that districts/cities with high values for climate variables tend to be spatially clustered with others exhibiting similarly high values, and vice versa. Previous research utilizing SAR-X with Casetti’s approach, as discussed by [16], did not account for spatial dependencies of exogenous variables, potentially leading to suboptimal results. To address this limitation, it is recommended to integrate these dependencies into the analysis through the Spatial Durbin Model (SDM). The model’s accuracy is supported by the precise estimation of parameter values assigned to each climate variable, contributing significantly to the high precision of the prediction results. The SDM with expansion using Casetti’s approach generates distinct parameter estimates for each exogenous variable at each of the 64 districts/cities, because the exogenous variable matrix is expanded with the latitude and longitude information of each location. The prediction outcomes obtained through the model were highly accurate at each observation location, and the proposed SDM with expansion using Casetti’s approach resulted in a very accurate rainfall prediction with a MAPE of 2.78%. Overall, this study indicates that the level of rainfall in each region, based on the data from 64 districts/cities in Java Island, was significantly influenced by the other climate variables, namely air temperature, humidity, solar irradiation, and surface pressure [17].

5. Conclusions

In conclusion, this research underscored that the level of rainfall in each region, based on data from 64 districts/cities in Java Island, Indonesia, was significantly influenced by four other climate variables. Air temperature significantly impacts rainfall; the interactions are complex, and rainfall can increase or decrease depending on the specific conditions and regions involved. Humidity is a critical factor in the formation and intensity of rainfall; the interplay between humidity and other atmospheric conditions determines the overall rainfall patterns in different regions. The effects of solar irradiation are integral to understanding diurnal and seasonal rainfall patterns and larger-scale climate phenomena such as monsoons. The interaction between high and low surface pressure systems and the resulting wind and moisture transport plays a critical role in determining rainfall patterns. It should be noted that these factors collectively exhibited a positive spatial autocorrelation across various areas. Climatological data are likely to contain extreme values; this can be addressed by developing a technique that uses a more robust objective function.

Accordingly, it is recommended that relevant institutions involved in rainfall prediction consider the multifaceted aspects of climate variables comprehensively. The modelling approach presented in this study facilitated a thorough understanding of how each climate variable influences specific locations. The findings from this study are anticipated to inform strategic decision-making at institutions such as BMKG and agribusiness companies, in order to enhance agricultural planning, optimize planting seasons, and disseminate valuable climate information to the general public, particularly regarding rainfall patterns.

6. Patents

Granted copyright: Copyright for Computer Program, number 000624157, entitled “RShiny Web Application for Spatial Durbin Model with Expansion using Casetti’s Approach”, Ministry of Law and Human Rights of the Republic of Indonesia (Falah, A. N., Andriyana, Y.), 2024.

https://andriyanafalah.shinyapps.io/SDM-Expansion/ (accessed on 11 June 2024).

Author Contributions

Conceptualization, Y.A. and A.N.F.; methodology, Y.A., A.N.F., B.N.R. and D.L.L.-P.; software, Y.A. and A.N.F.; validation, Y.A., B.N.R., A.S., E.H., T.H. and D.L.L.-P.; formal analysis, Y.A., A.N.F., B.N.R., A.S., E.H., T.H. and D.L.L.-P.; investigation, Y.A. and B.N.R.; resources, Y.A., B.N.R., A.S., E.H., T.H. and D.L.L.-P.; data curation, A.S., E.H. and T.H.; writing—original draft preparation, Y.A., A.N.F. and D.L.L.-P.; writing—review and editing, Y.A., A.N.F., B.N.R., A.S., E.H., T.H. and D.L.L.-P.; supervision, Y.A., B.N.R., A.S., E.H., T.H. and D.L.L.-P.; project administration, Y.A. and A.N.F.; funding acquisition, Y.A. All authors have read and agreed to the published version of the manuscript.

Funding

The authors gratefully acknowledge Universitas Padjadjaran for the financial support provided through the Postdoctoral Research Grant scheme with the contract number 2413/UN6.3.1/PT.00/2024, RDPD Research Grant scheme with the contract number 2096/UN.6.3.1/PT.00/2024 and the Fundamental Research Grant with the contract number 4039/UN6.3.1/PT.00/2024 from the Ministry of Research, Technology, and Higher Education of Indonesia (Kemendikbudristek).

Data Availability Statement

The data will be made available by the authors on request.

Acknowledgments

The authors gratefully acknowledge the National Research and Innovation Agency (BRIN), the Academic Leadership Grant Unpad 2024, and the RISE_SMA project of the European Union 2019-2024 for their assistance in conducting this research. Additionally, the valuable comments and suggestions provided by the reviewers for this paper are greatly appreciated.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

The equations of the SDM with expansion using Casetti’s approach for predicting rainfall in the 64 districts/cities of Java Island, Indonesia, are as follows:

No	Locations	SDM with Expansion Using Casetti’s Approach for Predicting Rainfall
1	Serang City	$\begin{array}{l} {\hat{y}}_{(s_{1})} = 0.999 \sum_{i = 1}^{64} w_{1 i} y_{(s_{i})} + 481.319 \times 1_{(s_{1})} - 5.626 X_{1} - 7.275 X_{2} - 2.606 X_{3} - 11.649 X_{4} + 11.969 \sum_{i = 1}^{64} w_{1 i} X_{1} \\ \quad - 0.227 \sum_{i = 1}^{64} w_{1 i} X_{2} - 6.869 \sum_{i = 1}^{64} w_{1 i} X_{3} - 8.393 \sum_{i = 1}^{64} w_{1 i} X_{4} \end{array}$
2	Pandeglang	$\begin{array}{l} {\hat{y}}_{(s_{2})} = 0.999 \sum_{i = 1}^{64} w_{2 i} y_{(s_{i})} + 481.319 \times 1_{(s_{2})} + 15.479 X_{1} + 19.166 X_{2} + 9.044 X_{3} + 28.659 X_{4} + 11.969 \sum_{i = 1}^{64} w_{2 i} X_{1} \\ \quad - 0.227 \sum_{i = 1}^{64} w_{2 i} X_{2} - 6.869 \sum_{i = 1}^{64} w_{2 i} X_{3} - 8.393 \sum_{i = 1}^{64} w_{2 i} X_{4} \end{array}$
3	Tangerang City	$\begin{array}{l} {\hat{y}}_{(s_{3})} = 0.999 \sum_{i = 1}^{64} w_{3 i} y_{(s_{i})} + 481.319 \times 1_{(s_{3})} + 4.452 X_{1} + 4.993 X_{2} + 3.745 X_{3} + 6.173 X_{4} + 11.969 \sum_{i = 1}^{64} w_{3 i} X_{1} \\ \quad - 0.227 \sum_{i = 1}^{64} w_{3 i} X_{2} - 6.869 \sum_{i = 1}^{64} w_{3 i} X_{3} - 8.393 \sum_{i = 1}^{64} w_{3 i} X_{4} \end{array}$
4	South Tangerang City	$\begin{array}{l} {\hat{y}}_{(s_{4})} = 0.999 \sum_{i = 1}^{64} w_{4 i} y_{(s_{i})} + 481.319 \times 1_{(s_{4})} - 1.688 X_{1} - 0.339 X_{2} - 4.845 X_{3} + 3.857 X_{4} + 11.969 \sum_{i = 1}^{64} w_{4 i} X_{1} \\ \quad - 0.227 \sum_{i = 1}^{64} w_{4 i} X_{2} - 6.869 \sum_{i = 1}^{64} w_{4 i} X_{3} - 8.393 \sum_{i = 1}^{64} w_{4 i} X_{4} \end{array}$
5	Kepulauan Seribu	$\begin{array}{l} {\hat{y}}_{(s_{5})} = 0.999 \sum_{i = 1}^{64} w_{5 i} y_{(s_{i})} + 481.319 \times 1_{(s_{5})} - 5.151 X_{1} - 7.316 X_{2} + 0.429 X_{3} - 10.069 X_{4} + 11.969 \sum_{i = 1}^{64} w_{5 i} X_{1} \\ \quad - 0.227 \sum_{i = 1}^{64} w_{5 i} X_{2} - 6.869 \sum_{i = 1}^{64} w_{5 i} X_{3} - 8.393 \sum_{i = 1}^{64} w_{5 i} X_{4} \end{array}$
6	Central Jakarta	$\begin{array}{l} {\hat{y}}_{(s_{6})} = 0.999 \sum_{i = 1}^{64} w_{6 i} y_{(s_{i})} + 481.319 \times 1_{(s_{6})} + 14.456 X_{1} + 19.245 X_{2} + 2.389 X_{3} + 25.193 X_{4} + 11.969 \sum_{i = 1}^{64} w_{6 i} X_{1} \\ \quad - 0.227 \sum_{i = 1}^{64} w_{6 i} X_{2} - 6.869 \sum_{i = 1}^{64} w_{6 i} X_{3} - 8.393 \sum_{i = 1}^{64} w_{6 i} X_{4} \end{array}$
7	Bekasi City	$\begin{array}{l} {\hat{y}}_{(s_{7})} = 0.999 \sum_{i = 1}^{64} w_{7 i} y_{(s_{i})} + 481.319 \times 1_{(s_{7})} + 4.330 X_{1} + 4.996 X_{2} + 2.867 X_{3} + 5.715 X_{4} + 11.969 \sum_{i = 1}^{64} w_{7 i} X_{1} \\ \quad - 0.227 \sum_{i = 1}^{64} w_{7 i} X_{2} - 6.869 \sum_{i = 1}^{64} w_{7 i} X_{3} - 8.393 \sum_{i = 1}^{64} w_{7 i} X_{4} \end{array}$
8	Bogor City	$\begin{array}{l} {\hat{y}}_{(s_{8})} = 0.999 \sum_{i = 1}^{64} w_{8 i} y_{(s_{i})} + 481.319 \times 1_{(s_{8})} - 2.159 X_{1} - 0.281 X_{2} - 7.615 X_{3} + 2.418 X_{4} + 11.969 \sum_{i = 1}^{64} w_{8 i} X_{1} \\ \quad - 0.227 \sum_{i = 1}^{64} w_{8 i} X_{2} - 6.869 \sum_{i = 1}^{64} w_{8 i} X_{3} - 8.393 \sum_{i = 1}^{64} w_{8 i} X_{4} \end{array}$
9	Indramayu	$\begin{array}{l} {\hat{y}}_{(s_{9})} = 0.999 \sum_{i = 1}^{64} w_{9 i} y_{(s_{i})} + 481.319 \times 1_{(s_{9})} - 6.084 X_{1} - 3.770 X_{2} - 8.221 X_{3} - 11.476 X_{4} + 11.969 \sum_{i = 1}^{64} w_{9 i} X_{1} \\ \quad - 0.227 \sum_{i = 1}^{64} w_{9 i} X_{2} - 6.869 \sum_{i = 1}^{64} w_{9 i} X_{3} - 8.393 \sum_{i = 1}^{64} w_{9 i} X_{4} \end{array}$
10	Karawang	$\begin{array}{l} {\hat{y}}_{(s_{10})} = 0.999 \sum_{i = 1}^{64} w_{10 i} y_{(s_{i})} + 481.319 \times 1_{(s_{10})} + 16.503 X_{1} + 11.497 X_{2} + 21.353 X_{3} + 28.294 X_{4} + 11.969 \sum_{i = 1}^{64} w_{10 i} X_{1} \\ \quad - 0.227 \sum_{i = 1}^{64} w_{10 i} X_{2} - 6.869 \sum_{i = 1}^{64} w_{10 i} X_{3} - 8.393 \sum_{i = 1}^{64} w_{10 i} X_{4} \end{array}$
…	…	…
64	Ponorogo	$\begin{array}{l} {\hat{y}}_{(s_{64})} = 0.999 \sum_{i = 1}^{64} w_{64 i} y_{(s_{i})} + 481.319 \times 1_{(s_{64})} - 0.259 X_{1} - 4.138 X_{2} - 6.511 X_{3} - 0.839 X_{4} + 11.969 \sum_{i = 1}^{64} w_{64 i} X_{1} \\ \quad - 0.227 \sum_{i = 1}^{64} w_{64 i} X_{2} - 6.869 \sum_{i = 1}^{64} w_{64 i} X_{3} - 8.393 \sum_{i = 1}^{64} w_{64 i} X_{4} \end{array}$

Appendix B

The source code is stored in a GitHub repository and can be accessed at the following link: https://github.com/AndriyanaFalah/SDM-Expansion (accessed on 8 June 2024).

References

1. Gunawan, D. Peta Curah Hujan Ekstrem Indonesia Periode 1991–2020; BMKG: Jakarta, Indonesia, 2022.
2. Nurlatifah, A.; Hatmaja, R.B.; Rakhman, A.A. Analisis Potensi Kejadian Curah Hujan Ekstrem di Masa Mendatang Sebagai Dampak dari Perubahan Iklim di Pulau Jawa Berbasis Model Iklim Regional CCAM. J. Ilmu Lingkung. 2023, 21, 980–986.
3. Kasa, A.N.G. Buletin Informasi Iklim November, No. 2. 2023; BMKG: Jakarta, Indonesia, 2023.
4. Kartika, Q.A.; Faqih, A.; Santikayasa, I.P.; Setiawan, A.M. Sea Surface Temperature Anomaly Characteristics Affecting Rainfall in Western Java, Indonesia. Agromet 2023, 37, 54–65.
5. Aldrian, E.; Susanto, R.D. Identification of three dominant rainfall regions within Indonesia and their relationship to sea surface temperature. Int. J. Clim. 2003, 23, 1435–1452.
6. Lee, H.S. General Rainfall Patterns in Indonesia and the Potential Impacts of Local Seas on Rainfall Intensity. Water 2015, 7, 1751–1768.
7. Aldrian, E. Pemahaman Dinamika Iklim di Negara Kepulauan Indonesia Sebagai Modalitas Ketahanan Bangsa. J. Ilmu Pertan. Indones. 2016, 27, 606–613.
8. BMKG. Pemutakhiran Zona Musim Indonesia Periode 1991–2020; BMKG: Jakarta, Indonesia, 2022; p. 126.
9. SDGs Indonesia. Sustainable Development Goals (SDGs)-Tujuan 13; SDGs Indonesia: Jakarta, Indonesia, 2021.
10. Hatfield, G. Spatial statistics. In Practical Mathematics for Precision Farming; SSSA: Madison, WI, USA, 2018; pp. 75–104.
11. Stohlgren, T.J. Spatial Analysis and Modeling. In Measuring Plant Diversity: Lessons from the Field; Oxford Academic: Oxford, UK, 2007; pp. 254–270.
12. Triyatno, T.; Berd, I.; Idris. Spatial Model of Flood Hazard Due To Land Cover Change in the Tarusan Watershed, West Sumatra—Indonesia. Int. J. GEOMATE 2023, 25, 21–29.
13. Bonsoms, J.; Ninyerola, M. Comparison of linear, generalized additive models and machine learning algorithms for spatial climate interpolation. Theor. Appl. Clim. 2024, 155, 1777–1792.
14. Anna, A.N.; Fikriyah, V.N.; Ibrahim, M.H.; Ismail, K.; Pamekar, M.S.; Asshodiq, A.D.T. Spatial Modelling of Local Flooding for Hazard Mitigation in Surakarta, Indonesia. Int. J. GEOMATE 2021, 21, 145–152.
15. Jalbert, J.; Genest, C.; Perreault, L. Interpolation of Precipitation Extremes on a Large Domain Toward IDF Curve Construction at Unmonitored Locations. J. Agric. Biol. Environ. Stat. 2022, 27, 461–486.
16. Falah, A.N.; Ruchjana, B.N.; Abdullah, A.S.; Rejito, J. The Hybrid Modeling of Spatial Autoregressive Exogenous Using Casetti’s Model Approach for the Prediction of Rainfall. Mathematics 2023, 11, 3783.
17. Hermawan, E.; Lubis, S.W.; Harjana, T.; Purwaningsih, A.; Risyanto; Ridho, A.; Andarini, D.F.; Ratri, D.N.; Widyaningsih, R. Large-Scale Meteorological Drivers of the Extreme Precipitation Event and Devastating Floods of Early-February 2021 in Semarang, Central Java, Indonesia. Atmosphere 2022, 13, 1092.
18. Anselin, L. Spatial Econometrics: Methods and Models. J. Am. Stat. Assoc. 1988, 85, 905.
19. LeSage, J. Spatial Econometrics Toolbox. In A Companion to Theoretical Econometrics; Blackwell: Oxford, UK, 1999; p. 273.
20. Lu, G.Y.; Wong, D.W. An adaptive inverse-distance weighting spatial interpolation technique. Comput. Geosci. 2008, 34, 1044–1055.
21. Maria, E.; Budiman, E.; Haviluddin; Taruk, M. Measure distance locating nearest public facilities using Haversine and Euclidean Methods. J. Phys. Conf. Ser. 2020, 1450, 12080.
22. Rosmanah, R.; Djara, V.A.D.; Andriyana, Y.; Jaya, I.G.N.M. The spatial econometrics of economic growth in Sumatera Utara province. J. Math. Comput. Sci. 2022, 12, 7283.
23. Notonegoro, Y.; Andriyana, Y.; Ruchjana, B.N. Comparison of distance-based spatial weight matrix in modeling Internet signal strengths in Tasikmalaya regency using logistic spatial autoregressive model. Int. J. Data Netw. Sci. 2024, 8, 893–906.
24. Kopczewska, K. Applied Spatial Statistics and Econometrics; Routledge: London, UK, 2020.
25. Lesage, J.; Pace, R.K. Introduction to Spatial Econometrics; CRC Press: Boca Raton, FL, USA, 2009.
26. Ord, K. Estimation Methods for Models of Spatial Interaction. J. Am. Stat. Assoc. 1975, 70, 120.
27. Smirnov, O.; Anselin, L. Fast maximum likelihood estimation of very large spatial autoregressive models: A characteristic polynomial approach. Comput. Stat. Data Anal. 2001, 35, 301–319.
28. Robinson, P.M.; Rossi, F. Refinements in maximum likelihood inference on spatial autocorrelation in panel data. J. Econ. 2015, 189, 447–456.
29. Qiu, F.; Ding, H.; Hu, J. Asymptotic Properties of Quasi-Maximum Likelihood Estimators for Heterogeneous Spatial Autoregressive Models. Symmetry 2022, 14, 1894.
30. Ezcurra, R.; Le Gallo, J.; Abate, G.D.; Chasco, C.; González-Val, R.; Elhorst, J.P. Theory and Practice of Spatial Econometrics. Spat. Econ. Anal. 2015, 10, 400.
31. Lawrence, K.D.; Klimberg, R.K.; Lawrence, S.M. Fundamentals of Forecasting Using Excel; Industrial Press Inc.: New York, NY, USA, 2009.
32. Ishwarappa; Anuradha, J. A brief introduction on big data 5Vs characteristics and hadoop technology. Procedia Comput. Sci. 2015, 48, 319–324.
33. Rohit, R.; Kapil, N.K.; Sandeep, K.; Ramya, L.K. Data Mining and Machine Learning Applications; Wiley-Scrivener: New York, NY, USA, 2022.
34. Qin, Y.; Zhang, P.; Liu, W.; Guo, Z.; Xue, S. The application of elevation corrected MERRA2 reanalysis ground surface temperature in a permafrost model on the Qinghai-Tibet Plateau. Cold Reg. Sci. Technol. 2020, 175, 103067.
35. Zhang, C.; Hu, C.; Wu, T.; Zhu, L.; Liu, X. Achieving Efficient and Privacy-Preserving Neural Network Training and Prediction in Cloud Environments. IEEE Trans. Dependable Secur. Comput. 2022, 20, 4245–4257.
36. Hu, C.; Zhang, C.; Lei, D.; Wu, T.; Liu, X.; Zhu, L. Achieving Privacy-Preserving and Verifiable Support Vector Machine Training in the Cloud. IEEE Trans. Inf. Forensics Secur. 2023, 18, 3476–3491.

Figure 1. Rainfall Patterns in Indonesia. Source: Meteorology, Climatology, and Geophysics Agency (BMKG), Indonesia [8].

Figure 2. Research Location in Java Island, Indonesia.

Figure 3. KDD flowchart on the SDM with expansion using Casetti’s approach.

Figure 4. Moran’s Scatterplot of climate variables: (a) rainfall, (b) air temperature, (c) humidity, (d) solar irradiation, (e) surface pressure.

Figure 5. Visualization of rainfall prediction in 64 districts/cities in Java Island.

Table 1. Climate Variables Data of the 64 cities/districts of Java Island, Indonesia.

Climate variables are reported as averages.
No.	Locations	Latitude	Longitude	Rainfall (mm)	Air Temperature (°C)	Humidity (%)	Solar Irradiation (W/m²)	Surface Pressure (kPa)
1	Serang City	−6.15	106	166.53	27.08	81.47	17.63	100.51
2	Pandeglang	−6.3092	106.1047	173.91	25.49	84.50	17.63	98.27
3	Tangerang City	−6.171389	106.640556	175.80	27.20	81.44	17.63	100.81
4	South Tangerang City	−6.28577727	106.7122607	178.34	24.66	85.15	17.63	96.82
5	Kepulauan Seribu	−5.662900	106.568300	202.67	27.80	80.03	18.25	100.96
6	Central Jakarta	−6.170000	106.820000	175.80	27.20	81.44	17.63	100.81
7	Bekasi City	−6.241586	106.992416	175.80	27.20	81.44	17.63	100.81
8	Bogor City	−6.899541	107.533867	178.34	24.66	85.15	17.63	96.82
9	Indramayu	−6.327583	108.324936	184.87	26.40	80.98	18.75	99.66
10	Karawang	−6.32273	107.337579	190.09	25.25	83.46	18.28	97.86
…	…	…	…	…	…	…	…	…
64	Ponorogo	−7.8686	111.4619	163.89	24.98	82.55	19.48	97.43

Table 2. Spatial autocorrelation test for the climate variables.

No	Climate Variables	$I$	$E (I)$	$V a r (I)$	p-Value
1	$y$ (rainfall)	0.703	−0.016	0.004	$2.2 \times 10^{- 16}$ *
2	$X_{1}$ (air temperature)	0.376	−0.016	0.004	$2.084 \times 10^{- 10}$ *
3	$X_{2}$ (humidity)	0.527	−0.016	0.004	$2.2 \times 10^{- 16}$ *
4	$X_{3}$ (solar irradiation)	0.806	−0.015	0.005	$2.2 \times 10^{- 16}$ *
5	$X_{4}$ (surface pressure)	0.337	−0.016	0.004	$7.612 \times 10^{- 9}$ *

* Significant at $α = 0.05$.

Table 3. Parameter estimated value.

Coefficient	Expansion Term	Estimate ${\hat{β}}_{0}$	Estimate $\hat{θ}$
$X_{1}$ (air temperature)	${\hat{β}}_{\mathrm{latitude}}$	−7.994	11.969
$X_{1}$ (air temperature)	${\hat{β}}_{\mathrm{longitude}}$	17.524	11.969
$X_{2}$ (humidity)	${\hat{β}}_{\mathrm{latitude}}$	2.312	−0.227
$X_{2}$ (humidity)	${\hat{β}}_{\mathrm{longitude}}$	7.295	−0.227
$X_{3}$ (solar irradiation)	${\hat{β}}_{\mathrm{latitude}}$	−0.570	−6.869
$X_{3}$ (solar irradiation)	${\hat{β}}_{\mathrm{longitude}}$	−0.570	−6.869
$X_{4}$ (surface pressure)	${\hat{β}}_{\mathrm{latitude}}$	0.191	−8.393
$X_{4}$ (surface pressure)	${\hat{β}}_{\mathrm{longitude}}$	0.456	−8.393

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Andriyana, Y.; Falah, A.N.; Ruchjana, B.N.; Sulaiman, A.; Hermawan, E.; Harjana, T.; Lim-Polestico, D.L. Spatial Durbin Model with Expansion Using Casetti’s Approach: A Case Study for Rainfall Prediction in Java Island, Indonesia. Mathematics 2024, 12, 2304. https://doi.org/10.3390/math12152304

