Developing Relative Spatial Poverty Index Using Integrated Remote Sensing and Geospatial Big Data Approach: A Case Study of East Java, Indonesia

Poverty data are usually collected through on-the-ground household-based socioeconomic surveys. Unfortunately, data collection with such conventional methods is expensive, laborious, and time-consuming. Additional information that can describe poverty with better granularity in scope and at lower cost, taking less time to update, is needed to address the limitations of the currently existing official poverty data. Numerous studies have suggested that the poverty proxy indicators are related to economic spatial concentration, infrastructure distribution, land cover, air pollution, and accessibility. However, the existing studies that integrate these potentials by utilizing multi-source remote sensing and geospatial big data are still limited, especially for identifying granular poverty in East Java, Indonesia. Through analysis, we found that the variables that represent the poverty of East Java in 2020 are night-time light intensity (NTL), built-up index (BUI), sulfur dioxide (SO2), point-of-interest (POI) density, and POI distance. In this study, we built a relative spatial poverty index (RSPI) to indicate the spatial poverty distribution at 1.5 km × 1.5 km grids by overlaying those variables, using a multi-scenario weighted sum model. It was found that the use of multi-source remote sensing and big data overlays has good potential to identify poverty using the geographic approach. The obtained RSPI is strongly correlated (Pearson correlation coefficient = 0.71 (p-value = 5.97 × 10^−7) and Spearman rank correlation coefficient = 0.77 (p-value = 1.58 × 10^−8)) with the official poverty data, with the best root mean square error (RMSE) of 3.18%. The evaluation of RSPI shows that areas with high RSPI scores are geographically deprived and tend to be sparsely populated with more inadequate accessibility, and vice versa.
The advantage of RSPI is that it is better at identifying poverty from a geographical perspective; hence, it can be used to overcome spatial poverty traps.


Introduction
Poverty is a historical problem that almost all countries have not been able to clearly solve [1]. According to the UN, approximately 8.2% of people in the world were living in poverty in 2019 [2]. People are considered poor if they live on less than USD 1.90 a day [3]. To overcome this problem, the UN proposed to "end poverty in all its forms everywhere" as the first goal of their Sustainable Development Goals (SDGs); this goal is expected to be achieved by 2030 [4]. Consequently, the reduction of poverty has become a challenging task for all countries, especially for developing or less developed countries [5]. Indonesia is one of the developing countries that is facing poverty. According to Statistics Indonesia, locally known as Badan Pusat Statistik (BPS), there are approximately 10.14% or 27.54 million Indonesian people living in poverty as of March 2021 [6]. To reduce poverty, through the Indonesia SDGs Roadmap Toward 2030 [7], the government has set targets for the poverty rate of Indonesia to drop to 6.5-7% by 2024 and 4.4-5% by 2030.
big data, such as point-of-interest (POI) data, is another potential option to identify poverty geographically. Shi et al. [14] stated that POI density and POI cost distance can reflect the convenience of human survival and production supporting the degree of regional socio-economic development, which is closely related to poverty.
Several studies have identified poverty from various geographic aspects through remote sensing and big data. Duque et al. [44] used high-resolution satellite imagery to produce an intra-urban poverty index using land-cover data. They found that the built-up index can explain up to 59% of the variability in the survey-based slum index. Principal component analysis (PCA) has also been used to summarize the spectral information obtained [45]. Niu et al. [46] developed the multi-source data poverty index (MDPI) based on the characteristics of the built environment obtained from remote sensing satellite imagery, consisting of land cover composition (NDVI, NDBI, and NDWI), NTL, and social conditions obtained from housing rent data. From this study, it was found that there is a high consistency between the MDPI and the official poverty measurement. Shi et al. [14] also developed a poverty index based on remote-sensing satellite imagery and geospatial big data, the comprehensive poverty index (CPI). The CPI was developed by combining NTL data, the digital elevation model (DEM), NDVI, and POI data to map poverty at 500-m spatial resolution. The results suggest that CPI provides a powerful way of identifying poverty distribution.
Multiple sources of remote sensing and geospatial big data have promising potential for capturing the geography of poverty (GOP), which studies poverty from a geographical point of view [47]. To the best of our knowledge, studies that identify the relative spatial poverty of Indonesia by integrating remote sensing and geospatial big data are still limited, especially those that combine economic spatial concentration (NTL), infrastructure distribution (BUI), land cover (NDVI and NDWI), air pollution (CO, NO2, and SO2), and accessibility (POI) as poverty proxy indicators. Such information could nevertheless help to address the limitations of the official poverty data. To fill this gap, we propose a new approach to map poverty with better granularity in scope and at a lower cost. In particular, this study aims to: (1) calculate the relative spatial poverty index (RSPI) based on multisource remote sensing and geospatial big data that represent poverty in the case study area, using a median aggregation method at the 1.5 km grid level; (2) provide a 1.5-km spatial resolution RSPI poverty map; and (3) validate the obtained RSPI through numerical and descriptive approaches. We selected East Java, Indonesia as the case study area and focused our research on poverty in East Java in 2020. In 2020, East Java was the province with the largest number of poor people in Indonesia [48]. The obtained results are expected to provide a lower-cost granular spatial poverty map that needs less time to update, to support the existing official poverty data. Policy decisions are thus expected to become more effective and efficient, so that the poverty reduction target can be achieved.

Study Area
East Java is one of 34 provinces in Indonesia, with Kota Surabaya as the provincial capital. East Java consists of 38 regencies/municipalities. The poverty rate of East Java in 2020 reached 11.09%, or approximately 4,419,100 people who are considered to be living in poverty [48]. Figure 1 shows the map of East Java as the case study, along with the distribution of official poverty data at the regency/municipality level in 2020. Official poverty data were calculated, based on a monetary expenditure approach, through the Indonesia National Socio-Economic Survey, SUSENAS [48]. The poverty of East Java is concentrated in the regencies on Madura Island, the northeast part of the East Java province. Madura Island is lacking in fertile areas and takes the form of a plateau without volcanoes, with dry agricultural land [49]. In 2020, the poverty rate of Sampang Regency was the highest in East Java, reaching 22.78%, followed by Bangkalan (20.56%) and Sumenep (20.18%) [48]. Kota Batu (3.89%) and Kota Malang (4.44%) are the municipalities with the lowest poverty rates in East Java. Both municipalities are well-developed areas with fertile plateaus and tourism areas [50,51]. Kota Surabaya, the largest metropolitan city in East Java and the second-largest in Indonesia, has the fourth-lowest poverty rate, with a value of 5.02% [48,52]. Physiographically, the southern region of East Java is a plateau area with volcanoes in the middle, while in the northwest, there are limestone mountains that are relatively barren.

Data Used in This Study
To identify spatial poverty in East Java, there are two types of geospatial data used in this study. The first type uses remote sensing raster images obtained from multisource satellite images; we used night-time light intensity (NTL) images from NOAA-VIIRS, the normalized difference vegetation index (NDVI), built-up index (BUI), and normalized difference water index (NDWI) from Sentinel-2, land surface temperature (LST) from MODIS, and carbon monoxide (CO), nitrogen dioxide (NO2), and sulfur dioxide (SO2) from Sentinel-5P. We selected currently publicly available remote-sensing data sources with the highest resolution. Remote-sensing satellite imagery data were collected and preprocessed through the Google Earth Engine, a cloud-based platform designed to store and process earth data. The Google Earth Engine supported the processing of the satellite image data into classification tasks [53][54][55]. The area and time scope of the data collection is East Java Province, Indonesia from 1 January 2020 to 31 December 2020.
The second type used is geospatial big data. We used point-of-interest (POI) data obtained from OpenStreetMap. A huge number of point locations of important places are included in the POI data, with 1 July 2020 as the time reference. We filtered several POI categories that are related to poverty according to previous studies [56,57]. Generally, four POI categories were included in this study: education, health, finance, and tourism. More than 13,000 points were collected, including those representing a hotel, restaurant, hospital, tourist attraction, post office, cafe, school, theatre, mall, university, etc. These data describe the accessibility of an area. The data used in this study are presented systematically in Table 1. From Table 1, we can see that the highest satellite spatial resolution is that of Sentinel-2 (10 m) and the lowest is that of Sentinel-5P (1.1 km). To accommodate the lowest resolution, we chose a 1.5 km spatial resolution poverty map as the output of this study. The detailed data pre-processing and calculation of the variables are described in Section 2.3. For validation purposes, we used the official poverty data at the regency/municipality level, as published by BPS. The data were obtained through the 2020 Indonesia National Socio-Economic Survey, SUSENAS. In these data, poverty is seen as an economic inability to meet basic food and non-food needs, as measured from the expenditure approach. A population is categorized as poor if their average monthly per capita expenditure on basic food and non-food needs is below the poverty line [48].

Methodology
In this study, we built a relative spatial poverty index (RSPI) as additional information that describes poverty with better granularity in scope and at lower cost, with less time to update. RSPI is intended to address the limitations of the existing household survey-based poverty data collection methods. The research framework of this study is schematically illustrated in Figure 2. The workflow consists of collecting and pre-processing data, transforming data, integrating data, performing correlation analysis and selecting variables, calculating RSPI, and then validating and interpreting the result. In performing our analysis and visualization, we utilized Python 3.6.9 and QGIS 3.10.4. The expected output was the 1.5 km × 1.5 km RSPI spatial poverty map and its validation. Further detailed explanations are outlined below.

Data Collection and Pre-Processing
The data collected from the sources that have been described were then pre-processed. Pre-processing data is one of the most important tasks for this index, including preparing the data and converting it to the proper format [70,71]. In this study, data pre-processing mostly aims to clean and improve the quality of the data to be analyzed. Data preprocessing that is performed for remote sensing satellite imagery is different from geospatial big data; the details are explained as follows.
Remote Sensing Satellite Imagery Data Pre-Processing
The remote-sensing satellite imagery data used in this study are a collection of images from 1 January 2020 to 31 December 2020. The obtained images are then pre-processed through four stages: cloud selection, cloud masking, median reducing, and band compositing. Cloud selection and cloud masking of Sentinel-2 and MODIS satellite imagery are performed based on the quality assessment described in Table 2. The collected NTL data are composite images that have been corrected for cloud cover using the VIIRS Cloud Mask (VCM) product. Median reducing is then performed to obtain one value for each observation that represents the satellite images from one year. To get NDVI, BUI, and NDWI values from Sentinel-2 images, band compositing is conducted using the formulas given in [62][63][64][72]. The obtained remote sensing satellite imagery data are shown in Figure 3.
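The band-compositing formulas themselves are not reproduced in this copy of the text. As a minimal sketch, assuming the commonly used definitions NDVI = (NIR − Red)/(NIR + Red), NDWI = (Green − NIR)/(Green + NIR), NDBI = (SWIR − NIR)/(SWIR + NIR), and BUI = NDBI − NDVI, the median reducing and band compositing steps for a single pixel could look as follows. The function names, the reflectance values, and the choice of the Green/NIR form of NDWI are illustrative assumptions, not the authors' implementation.

```python
from statistics import median


def normalized_difference(a, b):
    """Return (a - b) / (a + b), or None when the denominator is zero."""
    total = a + b
    return None if total == 0 else (a - b) / total


def median_composite(observations):
    """Per-band median over the cloud-free observations of one pixel."""
    bands = observations[0].keys()
    return {band: median(obs[band] for obs in observations) for band in bands}


def spectral_indices(green, red, nir, swir):
    """Compute NDVI, NDWI, and BUI under the assumed standard definitions."""
    ndvi = normalized_difference(nir, red)
    ndwi = normalized_difference(green, nir)
    ndbi = normalized_difference(swir, nir)
    bui = None if ndvi is None or ndbi is None else ndbi - ndvi
    return {"NDVI": ndvi, "NDWI": ndwi, "BUI": bui}


# Made-up surface reflectances for one pixel on three cloud-free dates.
observations = [
    {"green": 0.08, "red": 0.06, "nir": 0.30, "swir": 0.20},
    {"green": 0.10, "red": 0.05, "nir": 0.35, "swir": 0.15},
    {"green": 0.09, "red": 0.07, "nir": 0.40, "swir": 0.25},
]
pixel = median_composite(observations)
indices = spectral_indices(**pixel)
print(indices)
```

In this toy pixel the vegetation signal dominates, so NDVI is positive while NDWI and BUI are negative.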

Geospatial Big Data Pre-Processing
The point-of-interest (POI) data used in this study are in the form of vector data that contain points. The main pre-processing performed on POI data is calculating POI density and POI distance. To calculate the relative spatial poverty index (RSPI) with 1.5 km × 1.5 km spatial resolution, POI density and POI distance are calculated to fill the value of each grid cell. POI density is defined as the number of points in a 1.5 km × 1.5 km grid cell. POI distance is defined as the minimum distance from the center of a 1.5 km × 1.5 km grid cell to the nearest POI, calculated using the Euclidean distance approach. We calculate POI distance in meters. The obtained geospatial big data are shown in Figure 4.
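The two POI variables can be illustrated with a small sketch. It assumes that POI coordinates are already projected to a metric coordinate system (so Euclidean distances are in meters) and that the grid origin is at (0, 0); the function names and sample points are hypothetical, and the brute-force nearest-neighbor search is chosen only for clarity.

```python
import math

CELL_SIZE = 1500.0  # grid cell size in meters (1.5 km)


def cell_index(x, y, cell=CELL_SIZE):
    """Return the (column, row) of the grid cell that contains point (x, y)."""
    return (int(math.floor(x / cell)), int(math.floor(y / cell)))


def poi_density(pois, cell=CELL_SIZE):
    """Count the POIs that fall inside each grid cell."""
    counts = {}
    for x, y in pois:
        key = cell_index(x, y, cell)
        counts[key] = counts.get(key, 0) + 1
    return counts


def poi_distance(col, row, pois, cell=CELL_SIZE):
    """Euclidean distance (meters) from a cell center to the nearest POI."""
    cx = (col + 0.5) * cell
    cy = (row + 0.5) * cell
    return min(math.hypot(px - cx, py - cy) for px, py in pois)


# Three made-up POIs in projected coordinates (meters).
pois = [(100.0, 100.0), (1400.0, 200.0), (3100.0, 750.0)]
density = poi_density(pois)
nearest = poi_distance(1, 0, pois)
print(density, nearest)
```

Cells without any POI simply do not appear in the density dictionary and can be treated as zero; for a province-sized grid, a spatial index would be a more practical choice than the brute-force search used here.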

Data Transformation
Data transformation has two main objectives in this study: first, obtaining values with a similar range so that no variable dominates the others, and second, accommodating heteroscedasticity in the data so that better analysis can be obtained. We applied the Yeo-Johnson power transformation to achieve these objectives. The Yeo-Johnson transformation is an extension of the Box-Cox transformation that can deal with both positive and negative data; it can be applied to handle variables whose variability is unequal across the range by making them more Gaussian-like [73]. This data transformation is defined as follows:

$$
\psi(\lambda, x) =
\begin{cases}
\dfrac{(x+1)^{\lambda} - 1}{\lambda}, & \lambda \neq 0,\ x \geq 0 \\
\ln(x+1), & \lambda = 0,\ x \geq 0 \\
-\dfrac{(-x+1)^{2-\lambda} - 1}{2-\lambda}, & \lambda \neq 2,\ x < 0 \\
-\ln(-x+1), & \lambda = 2,\ x < 0
\end{cases}
$$

where x is the input data and λ is the parameter.
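A minimal sketch of the Yeo-Johnson transformation for a single value is given below. It assumes that λ is already known; the text does not state how λ was chosen, and in practice it is often estimated by maximum likelihood. The function name and the sample values are illustrative only.

```python
import math


def yeo_johnson(x, lam):
    """Apply the Yeo-Johnson power transformation to one value."""
    if x >= 0:
        if abs(lam) < 1e-12:
            return math.log1p(x)  # lambda = 0 branch: ln(x + 1)
        return ((x + 1.0) ** lam - 1.0) / lam
    if abs(lam - 2.0) < 1e-12:
        return -math.log1p(-x)  # lambda = 2 branch: -ln(-x + 1)
    return -((-x + 1.0) ** (2.0 - lam) - 1.0) / (2.0 - lam)


# lambda = 1 leaves the data unchanged; smaller values compress large positives.
values = [-3.0, 0.0, 3.0]
identity = [yeo_johnson(v, 1.0) for v in values]
compressed = [yeo_johnson(v, 0.5) for v in values]
print(identity, compressed)
```

With λ = 0.5, the positive value 3 is pulled toward zero while the negative value −3 is pushed further away, which is how the transformation reshapes a skewed distribution.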

Data Integration
To calculate the relative spatial poverty index (RSPI) with 1.5 km × 1.5 km spatial resolution, we applied aggregation to take the median value of the raster satellite imagery data, based on the 1.5 km × 1.5 km grid shapefile, and integrate it with geospatial big data. Besides this, for correlation analysis purposes, we applied zonal statistics to convert raster-based values into administrative-based values by ascertaining the median to make them comparable to the official administrative-based poverty data. The expected final output from this step is a 1.5 km × 1.5 km East Java vector grid and administrative-based vector, with the defined attributes shown in Table 1.
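The median aggregation used in both cases (pixels to grid cells, and pixels to regencies/municipalities) can be sketched as a simple group-by. This is an illustration only: it assumes each pixel has already been assigned a zone identifier, and the zone labels and values are made up.

```python
from collections import defaultdict
from statistics import median


def zonal_median(pixels):
    """Aggregate (zone_id, value) pairs into one median value per zone."""
    groups = defaultdict(list)
    for zone, value in pixels:
        if value is not None:  # skip masked (e.g., cloudy) pixels
            groups[zone].append(value)
    return {zone: median(vals) for zone, vals in groups.items()}


# Made-up pixel values; the zone may be a 1.5 km grid cell or a regency.
pixels = [
    ("cell_A", 1.0), ("cell_A", 5.0), ("cell_A", 2.0),
    ("cell_B", 4.0), ("cell_B", None), ("cell_B", 6.0),
]
result = zonal_median(pixels)
print(result)
```

The median is less sensitive than the mean to a few extreme pixels, such as a single very bright night-time light source inside an otherwise dark cell.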

Correlation Analysis and Variable Selection
In this study, correlation analysis was conducted to determine the relationship between each geospatial variable defined in Table 1 and East Java's official poverty data. We intended to find the relationship at the 1.5 km × 1.5 km grid level but, due to the limitation of official poverty data, we could only measure it at the regency/municipality administrative level. Both Pearson and Spearman correlation analyses were conducted to determine the relationship. The equation below shows the formula for the Pearson correlation coefficient (r):

$$
r_{xy} = \frac{n \sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{\sqrt{n \sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}\ \sqrt{n \sum_{i=1}^{n} y_i^2 - \left(\sum_{i=1}^{n} y_i\right)^2}}
$$

where r_xy represents the correlation between the first feature x_i and the second feature y_i, and n is the number of observations. The Spearman correlation is calculated in the same way after replacing each observation value with its rank. The magnitude of the correlation coefficient (|r|) ranges from 0 to 1, and the direction of the relationship is indicated by a positive or negative sign. Table 3 shows the guidelines for interpreting the results of the correlation coefficient (r) according to Sugiyono [74]. The correlation significance test is then carried out to determine whether the correlation coefficient obtained is statistically significant at a significance level (α) of 0.05. The null hypothesis is that there is no correlation between the two variables, while the alternative hypothesis is that there is a correlation between the two variables. Although several studies have shown that geospatial variables are related to poverty, Wang et al. [75] stated that regional poverty may vary in terms of spatial differences. From this correlation analysis, we selected variables that are statistically significantly correlated with the East Java official poverty data, according to the hypothesis testing. The selected variables should also be moderately, strongly, or very strongly (|r| ≥ 0.4) correlated with the East Java official poverty data, according to the correlation coefficient measure.
Therefore, the RSPI is built based on variables that linearly represent poverty in East Java.
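The correlation measures described above can be sketched in plain Python. The sample values are made up for illustration and are not the study's data; ties are handled with average ranks, and the t statistic t = r·sqrt((n − 2)/(1 − r²)) is the usual basis for the significance test, although the exact test procedure used by the authors is an assumption here.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def t_statistic(r, n):
    """t statistic for testing r against zero (undefined when |r| = 1)."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Made-up regency-level values: a night-time light proxy and a poverty rate.
ntl = [0.5, 0.9, 2.0, 9.0, 6.0]
poverty = [22.8, 20.6, 11.1, 5.0, 3.9]
rho = spearman(ntl, poverty)
print(rho, pearson(ntl, poverty))
```

With 38 observations, a coefficient at the |r| = 0.4 threshold gives a t statistic of about 2.6, so a variable that passes the strength criterion at this sample size will generally also pass the 5% significance test.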

Relative Spatial Poverty Index (RSPI) Calculation
The relative spatial poverty index (RSPI) is calculated by overlaying selected geospatial variables that represent poverty in East Java. In order to overlay the variables, we implemented a weighted sum model. Several previous studies have implemented this method for constructing a geospatial index [76][77][78]. The following formula shows the application of the weighted sum model for RSPI construction:

$$
RSPI = \sum_{i=1}^{p} w_i x_i
$$

where p is the number of overlaid variables used, w_i is the weight assigned to the i-th variable, and x_i is its observed value. We used two approaches when performing the weight calculation. First, we established the correlation-based weight; correlation coefficient information was used as the weights. Variables with higher correlations are assumed to represent poverty better. Second, we used the PCA-based weight. These weights are calculated via the first principal component, established using the principal component analysis (PCA) method. Several studies have used PCA as an approach to calculating the socioeconomic index. Uddin et al. [79] used PCA for mapping the socio-economic vulnerability of the coastal region. It was found that PCA is a very useful method for identifying vulnerable areas in the coastal region of Bangladesh. Cartone and Postiglione [80] used PCA to build a spatial deprivation index.
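The weighted sum overlay with correlation-based weights can be sketched as follows. Two points are assumptions made for illustration rather than details confirmed by the text: the weights keep the sign of the correlation coefficient (so that variables negatively related to poverty lower the index), and they are normalized so that their absolute values sum to one. The variable names and numbers are hypothetical, and the PCA-based weights are not shown.

```python
def correlation_weights(correlations):
    """Signed weights proportional to r, scaled so that sum(|w|) equals 1."""
    total = sum(abs(r) for r in correlations.values())
    return {name: r / total for name, r in correlations.items()}


def weighted_sum(cell, weights):
    """Weighted sum of the (transformed) variable values of one grid cell."""
    return sum(weights[name] * cell[name] for name in weights)


def min_max_scale(values):
    """Rescale values linearly to the range 0-1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical correlations with the poverty rate and three made-up grid cells.
correlations = {"NTL": -0.5, "POI_distance": 0.5}
cells = [
    {"NTL": 2.0, "POI_distance": -1.0},   # bright and close to POIs
    {"NTL": -1.0, "POI_distance": 1.0},   # dark and far from POIs
    {"NTL": 0.0, "POI_distance": 0.0},
]
weights = correlation_weights(correlations)
scores = [weighted_sum(cell, weights) for cell in cells]
scaled = min_max_scale(scores)
print(scaled)
```

The dark, remote cell receives the highest scaled score and the bright, accessible cell the lowest, which matches the intended reading of the index as relative deprivation.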

Validation Assessment
Validation assessment is an important way to establish how well RSPI describes poverty in East Java. We used two validation assessment approaches. The first method, numerical evaluation, measures the similarity between the obtained result and the available ground-truth data numerically. In this evaluation, we calculated the Pearson and Spearman correlations, root mean square error (RMSE), and R². The following formulas show the calculation of RMSE and R²:

$$
RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2}
$$

$$
R^2 = 1 - \frac{\sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n} \left(y_i - \bar{y}\right)^2}
$$

where y_i is the true value, ŷ_i is the predicted value, ȳ is the mean of the true values, and n is the number of observations. It is not possible to evaluate each pixel due to the limitations of ground-truth data availability. Therefore, we aggregated the pixel values for each regency/municipality by calculating the mean to create administrative-based data for comparison. Second, in terms of descriptive evaluation, we visually compared the obtained result with the ground-truth data. Several previous studies have shown that ground-truth identification through high-resolution imagery can offer an evaluation option that cannot be performed numerically on each pixel. For example, Varshney et al. [81] estimated the percentage of roof material through Google Earth images, and Shi et al. [14] randomly chose six points to be identified via high-resolution satellite imagery to recognize poverty through the slope of the land. In this study, we randomly picked six 1.5 km × 1.5 km pixels and identified the geographic characteristics of their areas using high-resolution Google Earth satellite imagery.
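A minimal sketch of the numerical evaluation is shown below. It assumes that the predicted values come from an ordinary least-squares line with the index as the predictor and the poverty rate as the response; the function names and the three data points are made up for illustration.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx


def rmse(y_true, y_pred):
    """Root mean square error between true and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up regency-level index values and poverty rates (percent).
rspi = [0.0, 0.5, 1.0]
poverty = [4.0, 10.0, 13.0]
slope, intercept = fit_line(rspi, poverty)
predicted = [slope * v + intercept for v in rspi]
error = rmse(poverty, predicted)
fit = r_squared(poverty, predicted)
print(slope, intercept, error, fit)
```

Because the poverty rate is expressed in percent, the RMSE is in percentage points, which makes it directly comparable across differently weighted versions of the index.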

Correlation Model Development
In this study, correlation analysis was carried out to determine the closeness and direction of the relationship between each geospatial variable and official poverty data. Correlation analysis is carried out after the data has been pre-processed and transformed. Due to the limitations of official poverty data that are only available at the regency/municipality level, we aggregated the pixel-sized geospatial variable data by taking the median value for each regency/municipality. Therefore, 38 observations were obtained for each geospatial variable. Figures 5 and 6 show the visualization of the geospatial variable maps, along with the official poverty maps, to compare them at the regency/municipality level. It can be seen that each geospatial variable describes a different spatial pattern for each region. The NTL variable, which is a proxy for economic activity, shows that high scores tend to be scattered in municipalities or urban areas with low poverty rates. The variables obtained from the Sentinel-2 satellite imagery (NDVI, BUI, and NDWI), which represent land cover, indicate that there is a homogeneous pattern in the southern regions of East Java, which comprises areas with middle-low poverty rates. ISPRS Int. J. Geo-Inf. 2022, 11, x FOR PEER REVIEW 13 of 30

Figure 6. Transformed geospatial variables: POI density (points) and POI distance (meters), and scaled official poverty data at the regency/municipality level.
LST variable values tend to be high in municipalities or urban areas with a low poverty rate. Each air pollution variable (CO, NO2, and SO2) obtained from Sentinel-5P satellite imagery shows a different pattern. CO values are high in the northwest of East Java, which is an area with middle-high poverty rates; NO2 values are high in industrial areas with different poverty rates; and SO2 values tend to be high in the central area, which is a densely populated mountainous area with a relatively low poverty rate. Accessibility variables (POI density and POI distance) capture the pattern of municipalities or urban areas with high accessibility and low poverty rates.
To achieve a better understanding of the relationship between geospatial variables and the official poverty rate at the regency/municipality level, we applied correlation analysis. Through this analysis, we acquired p-values that identify the statistically significant variables, and correlation coefficients that represent the closeness and direction of the relationship. The correlation coefficient was calculated using two approaches, namely, the Pearson and Spearman rank correlations. Table 4 shows the obtained correlation analysis results. Interpretation of the closeness and direction of the relationship was performed following the method used by the authors of [74]. As we can see in Table 4, the variables with a positive relationship are NDVI, NDWI, and POI distance. This shows that an increase in these variables is associated with an increase in the official poverty rate. Conversely, the variables with a negative relationship are NTL, BUI, LST, CO, NO2, SO2, and POI density. This shows that an increase in these variables is associated with a decrease in the official poverty rate.
From Table 4, it can be seen that there are five variables that are statistically significant when correlated to the official poverty rate: NTL, BUI, SO2, POI density, and POI distance. The SO2, POI density, and POI distance variables are strongly correlated with the official poverty rate, while NTL and BUI are moderately correlated, and the rest are weakly or very weakly correlated. To ensure that the relative spatial poverty index (RSPI) can be used to represent poverty in East Java, we chose variables that are significantly correlated (p-value < 0.05, number of observations = 38) and at least moderately correlated (|r| ≥ 0.4) to the official poverty rate. Therefore, five variables were selected to calculate the East Java RSPI: NTL, BUI, SO2, POI density, and POI distance.

Relative Spatial Poverty Index Calculation
To identify poverty with better granularity in scope and at the least cost and time to update, and to address the limitations of the existing household-based poverty data collection, this study proposes a relative spatial poverty index (RSPI). By applying a weighted sum overlay, we calculated the RSPI based on variables that were significantly and at least moderately correlated with East Java official poverty data: NTL, BUI, SO2, POI density, and POI distance. Two weight calculation approaches were used in this study: correlation-based weight (W1) and PCA-based weight (W2). The correlation-based weight (W1) was obtained based on the Pearson correlation coefficient. The PCA-based weight (W2) was obtained through the first principal component of PCA. Table 5 shows the derived weight calculations. Of the two types of weighting approaches, RSPI1 is calculated using W1 weights and RSPI2 is calculated using W2 weights. In this study, both RSPI1 and RSPI2 are calculated on a 1.5 km × 1.5 km spatial resolution grid to obtain a poverty map, as illustrated in Figures 7 and 8. To simplify the interpretation, we present the min-max-scaled relative spatial poverty map, so that the displayed values are in the range of 0-1. From Figures 7 and 8, it can be seen that the poverty maps generated by RSPI1 and RSPI2 give similar results; low values tend to be concentrated in the central part of East Java. The southern and northwest parts of East Java, along with Madura Island, tend to have relatively high values. To assess this similarity, the correlation coefficient was calculated; the obtained Pearson correlation coefficient is 0.99 and the obtained Spearman rank correlation is 0.98. This shows that RSPI1 and RSPI2 are strongly correlated.

RSPI Numerical Evaluation
The obtained RSPI was then validated with two approaches: numerical evaluation and descriptive evaluation. The numerical evaluation measures how close the RSPI values are to the official poverty data; the descriptive evaluation is discussed in the next section. Because the official poverty data are only available down to the regency/municipality level, it is not possible to evaluate each pixel. Therefore, we aggregated the RSPI pixel values within each regency/municipality by taking the mean, which yielded 38 RSPI values. The numerical evaluation used two approaches: correlation analysis and RMSE calculation. Table 6 shows the results of the correlation analysis. Each RSPI is statistically significantly correlated (p-value < 0.05) with the official poverty data. The Pearson and Spearman rank correlation coefficients show that RSPI₁ and RSPI₂ are strongly positively correlated with the official poverty data. The highest correlation was obtained by RSPI₁, which is calculated using the correlation-based weight (Pearson correlation coefficient = 0.71, p-value = 5.97 × 10⁻⁷; Spearman rank correlation coefficient = 0.77, p-value = 1.58 × 10⁻⁸). The positive sign indicates that higher RSPI values tend to accompany higher official poverty rates.
We also built a simple linear regression model, with RSPI as the independent variable and the poverty rate as the dependent variable. Table 7 shows the obtained model for each RSPI, along with its RMSE and R² value. The model built on RSPI₁ has the lowest RMSE and the highest R². Therefore, we conclude that RSPI₁ is the best index for numerically predicting the official poverty data. Figure 9 shows the regression plot of the official poverty rate against RSPI₁ and RSPI₂, that is, the predicted and the official poverty rates. The predictions based on RSPI₁ lie closest to the official poverty rates. Hence, we chose RSPI₁ as the most representative index of the official poverty data.
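The numerical evaluation described above (mean aggregation per regency/municipality, correlation analysis, and a simple linear regression with RMSE and R²) can be sketched as follows. The function names and the toy inputs are hypothetical, and the RMSE here is computed in-sample, which is an assumption: the text does not state how the reported RMSE was obtained.

```python
import numpy as np
from scipy import stats


def aggregate_mean(cell_values, cell_region_ids):
    """Mean of the cell-level index per region id; returns (region_ids, means)."""
    ids = np.asarray(cell_region_ids)
    vals = np.asarray(cell_values, dtype=float)
    regions = np.unique(ids)
    means = np.array([vals[ids == region].mean() for region in regions])
    return regions, means


def evaluate(index_by_region, poverty_rate):
    """Correlation analysis plus a simple linear regression of the poverty
    rate on the index, with in-sample RMSE and R^2."""
    x = np.asarray(index_by_region, dtype=float)
    y = np.asarray(poverty_rate, dtype=float)
    pearson_r, pearson_p = stats.pearsonr(x, y)
    spearman_rho, spearman_p = stats.spearmanr(x, y)
    fit = stats.linregress(x, y)
    predicted = fit.intercept + fit.slope * x
    rmse = float(np.sqrt(np.mean((y - predicted) ** 2)))
    return {
        "pearson": (float(pearson_r), float(pearson_p)),
        "spearman": (float(spearman_rho), float(spearman_p)),
        "slope": float(fit.slope),
        "intercept": float(fit.intercept),
        "rmse": rmse,
        "r2": float(fit.rvalue ** 2),
    }


# Toy aggregation: four grid cells that belong to two hypothetical regions.
regions, means = aggregate_mean([0.2, 0.4, 0.9, 0.7], [1, 1, 2, 2])

# Toy evaluation on a perfectly linear relationship.
result = evaluate([0.1, 0.2, 0.3, 0.4], [5.0, 7.0, 9.0, 11.0])
```

Because the official rate is expressed in percent, an RMSE computed this way is in percentage points, which matches how the RMSE is reported in the text.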

RSPI Ground Truth Analysis
Many previous studies have shown that ground-truth identification through high-resolution imagery offers an evaluation option where a numerical evaluation of each pixel is not possible. In this study, we randomly picked six 1.5 km × 1.5 km pixels and identified the geographic characteristics of their areas through Google Earth images. The RSPI used is RSPI₁, which is the most representative index of the official poverty data. Figure 10 shows the resulting RSPI ground-truth check. Areas with high RSPI scores tend to be sparsely populated and spatially deprived, with inadequate accessibility. In contrast, areas with low RSPI scores tend to be densely populated with better accessibility; urban areas tend to have low RSPI values.

Comparison between the Obtained RSPI and the Official Poverty Data
Although the relative spatial poverty index (RSPI) maps poverty from a geographical point of view, we were interested in descriptively comparing it with the official poverty data, which are calculated using the expenditure approach. Figure 11 compares the RSPI aggregated to the regency/municipality level (Figure 11a,b) with the official poverty data (Figure 11c). Aggregation is necessary because the official poverty data are only available down to the regency/municipality level. RSPI₁ and RSPI₂ present similar results. The areas with the lowest spatial poverty are Kota Surabaya, Kota Malang, and the other urban areas; according to the official poverty data, these areas also tend not to be monetarily deprived. The southern part of East Java also tends not to be monetarily deprived, yet it tends to have high RSPI scores. Therefore, although these areas are less poor according to the official poverty data, they may still be geographically deprived or affected by spatial poverty traps. Moreover, the regencies of Madura Island, which have high poverty rates according to the official poverty data, tend to have high RSPI scores. This indicates that, apart from monetary deprivation, these areas are also spatially deprived.

Figure 11. Comparison of (a) RSPI₁ and (b) RSPI₂ (scaled index) and (c) official poverty data (scaled percentage) at regency/municipality level.

Limitations and Future Possible Directions
Poverty data are usually collected through on-the-ground household-based socioeconomic surveys. Unfortunately, data collection with this method is limited in scope, expensive, laborious, and time-consuming [8]. Indonesia's official poverty data are obtained through the Indonesian National Socio-Economic Survey (SUSENAS), which is conducted every six months at the household level. However, survey-based data collection is inevitably subject to sampling error and non-sampling error [82]. Sampling error arises from using a sample to estimate the population: no matter how many samples are used, there will always be a difference between the estimated and the actual value [83]. Non-sampling error arises from inaccuracies in data collection. Sari et al. [84] revealed that the SUSENAS data collection is still subject to human errors, such as inaccuracies during data entry. In addition, the SUSENAS data are only available down to the regency/municipality level, even though decision-making requires more granular data [85]. On the other hand, remote sensing and big data offer datasets that are unlimited, free, easier to obtain, and representative of the population. However, an index built from satellite imagery and big data, such as the relative spatial poverty index (RSPI), only captures poverty information from a geographical point of view, and how well such data describe actual poverty, which is multidimensional, has not yet been evaluated. Given these respective advantages and disadvantages, data integration makes it possible to offset the weaknesses of one source with the strengths of the other. In this case, we can obtain monetary poverty information at the regency/municipality level from the official poverty data and observe spatial poverty at a resolution of 1.5 km × 1.5 km using RSPI.
The integration of the official poverty data and RSPI is also useful for describing poverty from another point of view. So far, the calculation of poverty in Indonesia still uses the expenditure approach: people are categorized as poor if their average monthly per capita expenditure on basic food and non-food needs is below the poverty line [48]. In fact, poverty is a multidimensional problem that cannot be seen only from the point of view of income or expenditure [86,87]. Therefore, studies that measure poverty with other approaches, such as the geographic approach, continue to be conducted. Areas with unique geographic characteristics play an important role in poverty mapping; poor people are found to be more likely to live in certain places [80,88]. Spatial poverty traps are situations in which geographical capital (for example, an area's physical nature and socio-political conditions) is low and poverty is high as a result of geographic disadvantage [89]. They are areas where people are at higher risk of being trapped in poverty than elsewhere, due to spatial disadvantages [90]. For identifying poverty traps, RSPI can be a better approach because it is built from various geographic variables.
To sum up, RSPI can capture spatial poverty at a 1.5-km grid level and could potentially be updated every month. The RSPI datasets were obtained from free and publicly available data sources, which are faster to access and require fewer human resources. Nevertheless, RSPI can only describe poverty from a geographical point of view. Moreover, further and more accurate validation down to the 1.5-km grid level is still needed to evaluate how well RSPI describes poverty, which is a multidimensional problem. Therefore, we suggest integrating RSPI with official poverty data for better poverty identification and monitoring.
In addition, it is possible for RSPI calculations to be applied to other regions of Indonesia or even to other countries by paying attention to the spatial characteristics that characterize poverty in a particular area. This is in line with the research of Wang et al. [75], which states that regional poverty can vary according to spatial differences. To ensure that certain geospatial variables can be used to describe poverty in certain areas, further analysis is needed regarding the relationship between the geospatial variables used and poverty in the specific geospatial area.

Conclusions
To address the limitations of the existing household survey-based poverty data collection, this study provides a relative spatial poverty index (RSPI) with finer granularity, which is less costly and takes less time to update. The RSPI calculation uses multisource remote sensing satellite imagery and geospatial big data. The RSPI in the case study area, East Java, Indonesia, is calculated from geospatial variables that specifically represent poverty in East Java in 2020: night-time light intensity (NTL), the built-up index (BUI), sulfur dioxide (SO₂), point-of-interest (POI) density, and POI distance, all of which are statistically significantly and at least moderately correlated with the official poverty data. These variables are then overlaid by a weighted sum model using two weight calculation approaches: a correlation-based weight and a PCA-based weight. We found that multisource remote sensing and geospatial big data have good potential for representing poverty in East Java, Indonesia in 2020. This is evidenced by the strong correlation between the RSPI at the regency/municipality level and the official poverty data. The best RSPI for representing poverty in East Java in 2020, calculated using the correlation-based weighted sum model, is strongly correlated with the official poverty data, with a Pearson correlation coefficient of 0.71 (p-value = 5.97 × 10⁻⁷) and a Spearman rank correlation coefficient of 0.77 (p-value = 1.58 × 10⁻⁸). This RSPI is also quite promising as a predictor variable for estimating poverty data. We built a simple linear regression model that estimates the East Java 2020 official poverty rate with RSPI as the only predictor variable. The model obtained an RMSE of 3.18%, with an R² of up to 0.50. RSPI is then presented in the form of a non-technical, user-friendly poverty map with a spatial resolution of 1.5 km × 1.5 km.
The results of the descriptive evaluation of this map indicate that areas with high RSPI scores tend to be geographically deprived areas that are sparsely populated, with more inadequate accessibility; in contrast, areas with low RSPI scores tend to be densely populated areas that have better accessibility. Therefore, the ability of RSPI to map spatially deprived areas can be used to support the official poverty data.

Data Availability Statement:
The datasets generated during and/or analyzed during the current study are available from the corresponding author upon reasonable request.

A higher NTL value indicates higher economic activity in an area [60]. Through descriptive analysis, we observed several NTL values to identify the corresponding geographic features and compare them with the official East Java poverty data. Figure A1 shows some of the NTL values and their ground-truth check. Using Figure A1, we discuss some of the anomalies that arise when NTL data are used to identify poverty in East Java. First, high NTL scores are clustered in Kota Surabaya, the largest metropolitan city in Indonesia after Jakarta. However, official poverty data show that Kota Batu, Kota Malang, and Kota Madiun are the three lowest-poverty regencies/municipalities in East Java. Second, most of the largest NTL values are not in regencies/municipalities with low poverty; areas with energy plants and factories that operate late into the night tend to have high NTL values. Third, high NTL values were also found in plantations with all-night lighting, which do not reflect poverty. These anomalies indicate that in some areas, such as East Java, NTL values used as a poverty proxy need to be supported by other data for better identification. Figure A1 also shows that areas with low NTL values are usually the forests around mountains.

Dawson et al. [32] revealed that the NDVI (normalized difference vegetation index) was significantly correlated with poverty (positive high and/or negative high) in both shrinking and growing countries. Figure A2 shows that the very low NDVI values are likely from water-covered areas, such as ponds; these values are usually lower than those of buildings. High NDVI values indicate dense forest areas. This is in line with a study by Maurya R. et al. [91], which stated that negative NDVI values represent non-vegetated areas such as water, while positive NDVI values indicate vegetated ones. Kota Surabaya and Kota Malang, which are densely built-up areas, tend to have lower NDVI values, and poverty in these areas is also low.
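For readers unfamiliar with the index, NDVI is the normalized difference of near-infrared and red reflectance, NDVI = (NIR − Red) / (NIR + Red), which is why water tends toward negative values and dense vegetation toward high positive values. A minimal sketch is given below; the function names and the reflectance values are hypothetical, and the band inputs are assumed to be surface reflectance arrays of equal shape.

```python
import numpy as np


def normalized_difference(band_a, band_b):
    """Compute (a - b) / (a + b) element-wise; NaN where a + b is zero."""
    a = np.asarray(band_a, dtype=float)
    b = np.asarray(band_b, dtype=float)
    denom = a + b
    out = np.full(denom.shape, np.nan)
    np.divide(a - b, denom, out=out, where=denom != 0)
    return out


def ndvi(nir, red):
    """NDVI = (NIR - Red) / (NIR + Red)."""
    return normalized_difference(nir, red)


# Hypothetical reflectance values for three pixels:
# vegetation-like, water-like, and a pixel with equal NIR and red reflectance.
nir = np.array([0.5, 0.1, 0.3])
red = np.array([0.1, 0.3, 0.3])
values = ndvi(nir, red)
```

The same helper applies to other normalized-difference indices by swapping the band pair, although the exact band combination used for each index in this study is not restated here.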
According to Lee et al. [63], the built-up index (BUI) is suitable for classifying urban and non-urban areas. A higher BUI value indicates a higher likelihood that a pixel represents a built-up area. However, in the case of heterogeneous East Java, very high BUI values were obtained not only in urban built-up areas but also in lime mines and inland water, as Figure A3 shows. Lime mining is mostly found in the Tuban and Bojonegoro regencies; therefore, these areas also have high BUI values, alongside the urban areas. In fact, according to official poverty data, Tuban and Bojonegoro are regencies with high poverty. Areas with low BUI values are usually forests and mountain craters. This is why regencies/municipalities with many mountains, such as Kediri, Malang, Lumajang, Jember, and Pasuruan, tend to have low BUI values.

Figure A3. East Java BUI values (index) and their resulting ground-truth checking.

Figure A4 shows the distribution of the normalized difference water index (NDWI). Theoretically, NDWI values above 0 indicate a water body, whereas values below 0 indicate a non-water surface [64]. Areas covered by water mostly have an NDWI value above zero, except for the Karangkates reservoir, which has an NDWI value below zero. Regencies/municipalities with many mountains, such as Kediri, Malang, Lumajang, Jember, and Pasuruan, tend to have high NDWI values.

Figure A4. East Java NDWI values (index) and their resulting ground-truth checking.
Land surface temperature (LST) is often used to identify urban areas. In their research, which examines the correlation between LST and urban heat islands, Mia et al. [92] stated that the higher the LST value, the more extensive the urban heat island. However, as Figure A5 shows, in the case of East Java, areas with high LST are not only urban areas but also arid areas and limestone mining areas. The northern part of East Java has a great deal of arid and empty land and limestone mines, which tend to have a higher LST value than the southern area, which consists of mountains. Regencies/municipalities with high poverty rates tend to have low LST values, except for Kota Surabaya and the regencies/municipalities around it.

Air pollution can be one indicator of economic activity that is related to poverty. Wu et al. [93] use CO₂ (carbon dioxide) and SO₂ (sulfur dioxide) to examine the poverty-environmental trap. In this study, we examine CO (carbon monoxide), NO₂ (nitrogen dioxide), and SO₂ to capture air pollution in East Java. Figures A6-A8 show the East Java pollution indicator values and their ground-truth checks. The high values of each pollution indicator capture different information. Figure A6 shows that areas with high CO values are industrially dense areas; these values exceed the CO value of Kota Surabaya, which is the second-largest metropolitan city in Indonesia. Figure A7 shows that areas with high NO₂ values are steam power plants and densely populated and industrial areas. Figure A8 shows that the highest SO₂ values we captured are in mountain slope areas; however, the values can be small in other areas. On the other hand, the low values of each pollution indicator capture similar information: areas with low CO, NO₂, or SO₂ are usually mountain peaks.

Figure A7. East Java NO₂ values (mol/m²) and their resulting ground-truth checking.