Estimating District-Level Electricity Consumption Using Remotely Sensed Data in Eastern Economic Corridor, Thailand

: The intensive industrial development in special economic zones, such as Thailand’s Eastern Economic Corridor, increases energy consumption, leading to an imbalance of energy supply and a challenge for energy management. Electricity consumption at a local level is crucial for utility planners to manage and invest in the electrical grid. With this study, we propose an electricity consumption estimation model at the district level using machine learning with publicly available statistical data and built-up area (BU), area of lit (AL), and sum of light intensity (SL) data extracted from Landsat 8 and Suomi NPP satellite nighttime light images. The models created from three machine learning algorithms, which included Multiple Linear Regression (MR), Decision Tree (DT), and Support Vector Regression (SVR), were compared. The results show that (1) electricity consumption is highly correlated with SL, AL, and BU; and (2) the DT model demonstrated a better performance in predicting local electricity consumption when compared to MR and SVR with the lowest error rate and highest R 2 . The local government in developing countries with limited data and ﬁnancial resources can adopt the proposed approach to beneﬁt from utilizing commonly available remote sensing and statistical data with simple machine learning models such as DT (regression method) for sustainable electricity management.


Introduction
Nowadays, energy has become an essential support for economic growth and is a rapid indicator of human living standards. Electricity is one of the primary energy sources and has a solid relationship with the degree of country development and quality of life in modern society [1]. Energy consumption can be explained by many factors, including economic growth, non-agricultural GDP, population, and urbanization [2]. Consequently, many pieces of previous research have focused on how energy consumption relates to Human Development Index (HDI) at the national level in developed countries [3][4][5]. These studies aim to sustain the welfare of developed countries through energy demand and its footprint. At the local scale, electricity consumption plays a significant role for utility planners to forecast load and invest in the electrical grid. Therefore, estimating electricity consumption enhances the understanding of human activities in the region and helps authorities determine the appropriate electricity supply [6].
The electricity consumption can be estimated by combining socioeconomic indicators as well as population density, human mobility, and electronic devices per household to predict electricity consumption [6][7][8]. However, the in-depth electricity load curve data are often unavailable in developing countries, such as Thailand. In such a situation, other proxies such as nighttime light (NTL) data from satellite images and geo-informatic and socioeconomic data are widely applied to estimate electricity load and consumption at the local level. For example, NTL has been applied as a proxy for an electricity-related The EEC's electricity consumption is heterogeneous due to the differences in geographic and socioeconomic factors. Rayong and Chonburi are the top GPP generators in Thailand because of their early industrial and tourism development from the Eastern Seaboard policy in 1990. Conversely, Chachoengsao has different economic activities, mainly focusing on agricultural and cultural tourism with medium GPP. Therefore, the electricity consumption estimation at a medium scale (district level) is required to support local government decision making.
In 2015, Phamornchantaramast S. et al. forecasted electricity demand in Chonburi province based on classified land-use types from Landsat 5 and 7. They used the built-up area to estimate the electricity demand per square kilometer. Their result showed that a medium correlation of 0.47 existed between predicted electricity demand and actual electricity consumption [23]. The electricity consumption in the districts close to industrial and tourism development areas was underestimated. Using Thailand's EEC as a case study, this paper aimed to build a suitable estimation model for electricity consumption at the district level by employing and comparing three MLAs (MR, DT, and SVM). Due to the limitation of the electricity load and electricity consumption database within the study area, we extracted nighttime light and built-up area from the remotely sensed data. We used them along with general socioeconomic data, such as population, household, and Remote Sens. 2021, 13, 4654 4 of 21 population density, as predictors of electricity consumption for 30 districts in the chosen region, based on the number of available data points. The detailed methodology, including dataset, pre-processing, prediction models, and validation approaches, was introduced in the next section. The results were then presented, and the findings were discussed. In the end, a conclusion was provided.

Study Flowchart
The flowchart in Figure 2 presents the processing steps used in this study. First, the remote sensing data were organized through Google Earth Engine (GEE) platform and processed to extract the built-up area, annual average light intensity, and area of lit from Landsat 8 Operational Land Imager (OLI), Suomi NPP Visible Infrared Imaging Radiometer Suite (VIIRS) Day and Night Band (DNB), respectively. Second, the sum of light intensity (SL) using zonal statistic tools was calculated for each district through ArcGIS online. All datasets were then prepared for model building after cleansing and normalizing. The electricity consumption estimation models included MR, DT, and SVR were conducted in RapidMiner ® software. The feature selection and the optimization of model parameters were also controlled.  The National Statistic Organization (NSO) of Thailand provides the general statistic from country to subdistrict level. The population and household data at the district level within EEC regions were downloaded from their website. Pre-processing was then performed to calculate and round up the population density for each district. The number of households showed a gradual increase over time. The population growth is radically fluctuating because of the intensive mobility of both industry and tourism laborers in the EEC's industrial regions. The commuter population increased from 800,000 people in 2010 to 1,000,000 people in 2019, influencing urbanization and the energy consumption rate [24]. The population and households dramatically increased from 2017 to 2018 after implementing a special economic development policy in 2016 [25]. GPP has been regarded as a strong indicator of electricity consumption [21]. We then gathered the 2012 to 2018 Gross Provincial Product per Capita from the website of the Office of the National Economic and Social Development Council (NESDB), Thailand.

Remotely Sensed Data
This study used NTL data extracted from Visible Infrared Imaging Radiometer Suite Day-Night Band (VIIRS-DNB) products and Landsat 7 and 8 ETM images from 2012 to 2018. The time series data were processed on Google Earth Engine (GEE, https://earthengine. google.com accessed on 24 June 2021), a cloud-based computing platform that allows users to conduct geospatial data analysis by providing various computation, analyzation, and visualization. GEE becomes an effective and constructive tool with various satellite images and geospatial datasets [26,27]. GEE demonstrated high potential in the large-scale temporal urbanization analysis process. Pu et al. [27] utilized Landsat time series data and computation algorithms in GEE to map urban areas and achieved an accuracy of 0.8 to 0.9.
In this study, we adopted a method from previous research with processes based on the GEE platform. These include (1) extracting light intensity and area of lit from the annual mean composite of monthly VIIRS-DNB, (2) computing built-up indices from the annual median composition of cloud-masked Landsat 7 and 8 Surface Reflectance (SR) datasets, and (3) classifying the built-up areas from the derived indices using adaptive thresholding method (Otsu's threshold) [28]. The built-up area extraction accuracy of 2012 and 2016 was assessed using 500 random sampling points from the national land-use dataset and virtual ground truth from Landsat 7 and 8 images each year. The overall accuracy of the build-up areas for 2012 and 2016 were 78% and 85% with kappa index values at 74% and 82%, respectively. The Landsat 7 (2012) and Landsat 8 (2013-2018) dataset was requested from the GEE platform, filtering by year, processing cloud mask, and selecting band 1 to 7. Each year of Landsat data has been reduced by median (derived from a histogram) for pre-compute the Landsat indices. We then adopted the built-up area extraction method from the previous paper, using the Modified Built-up Index (MBUI), which employs Otsu's thresholding technique, and integrates with the nighttime light data [29]. The MBUI is calculated by Equation (1).
where: BUI = NDBI − NDVI, NDBI = Considering the histogram of MBUI that represents some bimodal pattern, we choose Otsu's method to classify the built-up area. Otsu's thresholding is a fast and straightforward method to determine an optimal threshold value from an image histogram. The optimal threshold is given when the between-class variance is maximized [30]. Otsu's thresholding has been used in several pieces of research [31][32][33]. The sample of the result is presented in Figure 3c. We employed GEE algorithms to request and export the monthly VIIRS-DNB dataset (2012-2018), which removes cloud cover using the VIIRS cloud mask product (VCM) and correct for stray light [34]. We calculated the annual average light intensity for each year, except the year 2012, which we calculated using the average light intensity from six-monthly datasets (June to December 2012) based on data availability. Then, we adopted the zonal statistic tool in ArcGIS Pro to identify the sum of light intensity in each district polygon from light intensity images. Zonal Statistics is a tool to calculate each zone defined by a zone dataset, based on values from another value raster dataset. A single output value (the sum of light intensity) was computed for every zone (district).
Furthermore, the lit area was extracted using the thresholding method to differentiate each district's dark and bright areas. The threshold value was calculated for each year from 2012 to 2018 from the annual mean composite of VIIRS dataset by applying Otsu's method. The threshold of the lit area was found to be between 19.25 to 20.93 nW cm −1 sr −1 . The samples of built-up area, sum of light intensity, and area of lit maps extracted from remotely sensed data are presented in Figure 3.

Data Preparation for Machine Learning Models
The collected social and economic databases contain unnecessary information, a wide range of missing values and noises, and inconsistencies for prediction models. Before conducting any statistical analysis, the cleansing process, such as eliminating irregularities and outlying data, is necessary to ensure meaningful data and the reliability of model outputs [35]. Our cleaning method employed RapidMiner ® Software by using an imputing operator to estimate missing values through learning models. Then, we normalized all variables using the statistical method (z-score). The equation is shown in Equation (2). where x is normalized x, µ is the mean, and σ equals the standard deviation. The z-score is a common normalization technique. It preserves the original distribution of the data and is less influenced by outliers. After the cleansing and normalizing processes were completed, the descriptive statistics of each variable were calculated (Table 1). Next, a data splitting strategy was applied to randomly split the available dataset into three groups for the ML tasks (training, validating, and testing) [36,37]. We split the data from the 2012-2018 dataset (100%: n = 210) into three sets: (1) training (60%: n = 126), (2) validating (20%: n = 42), and (3) testing (20%: n = 42). The training set was used for learning and estimating parameters of the model, while the validation set was used to evaluate the model and for model selection. Lastly, the testing set was a set of examples used to assess the model's predictive performance.

ML Model Setting
ML-based prediction models consist of four types of resources: dataset, training process, model, and prediction. The workflow was to use training data during the training process for model creation. This model with data input was used to establish the predictions. The activation functions were first employed for linear and nonlinear models to train an ML model, depending on the specific ML-based prediction models. Therefore, the ML was trained by using simple frameworks. The tweaking parameters as hyperparameter optimization were applied to improve the trained ML by adjusting model parameters to achieve the best results. In addition, for ML model construction, the feature selection method was necessary for identifying a subset of only relevant input variables to be used. The variable selection process was beneficial in reducing the overfitting and training time while improving accuracy [38]. In this study, we experimented based on two assumptions: (1) the electricity consumption prediction model is linear, the variables are of a normal distribution, and (2) there is non-linearity in the variables. Hence, we applied Multiple Regression (MR), Decision Tree (DT), and Support Vector Regression (SVR) to deal with linear and nonlinear functions. The above process was conducted in RapidMiner ® Software in five steps (see Appendix A Figure A1).
Step one: The basic processing is composed of three tasks, including (1) primary pre-processing data, which started from loading the dataset and performed some basic pre-processing tasks such as defining target columns, removing unnecessary columns, and unifying value types, and then filtering examples. (2) The next task was creating a training and validation set by splitting data into a 60:40 ratio for performance calculation. (3) Then, the essential feature engineering was applied to replace missing and infinite value. This also includes data encoding and removing columns with too many nominal values.
Step two: The automatic feature selection and modeling. The feature selection was automatically performed to arrange and optimize the number of predictor variables used by each model. This feature selection supports multi-objective feature selection and defends a balance correlation value between 1 and 0. The multi-objective evolutionary algorithm was operated to find the best feature sets to build models for the feature selection process. Each feature set was Pareto-optimal concerning complexity vs. model validation. The complexity was calculated based on the feature set where each feature in the set contributes complexity one. The selected feature sets were determined based on the optimal trade-offs between complexity and model errors. Afterward, we trained the models and applied automatic hyperparameter tuning (parameter optimization).
Step three: The transform validation and scoring data were performed, preparing data for validation and the scoring process in the next step.
Step four: Scoring and validation processed by applying the trained models on validation and scoring dataset. In this step, we applied automatic hyperparameter tuning, such as optimizing parameters with three k-folds validation, to estimate the statistical performance of a learning model. Then, we calculated MAE and RMSE to determine the performance of prediction models.
Step five: The final prediction models were created by training models with the same parameters on the combined training and validation dataset.

Multiple Regression (MR)
Multiple Regression is a statistical technique of linear regression that predicts the value of a response (dependent) variable based on two or more explanatory (independent) variables. It is one of the most commonly used statistical techniques and has been adopted in many research studies that require continuous output results [18]. The multiple regression equation was shown in the following form: where y denotes the total electricity consumption (TEC), and each x i is the feature selected from the statistical variables in Table 1, such as total population (TP), household (TH), sum of light (SL), etc. b i (i = 1, 2 . . . n) are the regression coefficients. c is the independent identically distributed normal error, representing the deviation of the predicted value from the observed value of TEC. The fundamental assumptions of MR include the linear relationship between TEC and the selected features (e.g., total population (TP), total household (TH), sum of light (SL)). The MR model parameters are often estimated with the training data using the ordinary least squares approach by minimizing the sum of differences between predicted and observed values. After the model is built, it can be validated and assessed with the testing dataset to measure the accuracy of the prediction models. The coefficient of determination (R 2 ) is often used to measure how much of the variation in the dependent variable can be explained by independent variables. In this study, we inputted the normalized variable to RapidMiner ® Software. Then, the automatic feature selection function was applied to select the optimized prediction variables for the best fitting model.

Decision Tree (DT)
A Decision Tree is a non-parametric supervised learning method that repeatedly divides a set of training data into smaller subsets based on tests of one or more variables to create a prediction model for the target value [39]. Unlike many other statistical approaches, the DT (regression tree) does not have the requirements of value distribution or the independence of the variables from one another [40]. The model is trained by learning based on decision rules from features. The collection of nodes intends to decide on values connected to estimate a numerical target value (regression approach). The selected variable will optimally separate to reduce the error following the criterion. The new nodes are repeatedly built until the estimated numerical value meets the target (criteria).
The sampling of new nodes is repeated until the stopping criteria are met. A prediction for the class label "attribute" is determined depending on the majority of "examples". The criteria reach this leaf during generation, while an estimation for a numerical value is obtained by averaging the values in a leaf.
Our DT model was built by determining optimal parameters using threefold crossvalidation. We then trained by setting the splitting criterion as Least Square (LS) error, which minimizes the squared distance between the average values in the node regarding the actual value. The LS is one of the standard methods to build a regression tree. It can improve the computational efficiency result that deals with the nominal and numerical variable tree [41]. The least square error criterion can be expressed as follows: where n is the sample size, x i is a data point, and r(β, x i ) is a regression model for each pair of (x i , y i ). The model criterion was set for 20 maximum depths. We also applied autoprepruning to control the parameters of minimal gain = 0.05, minimal leaf size = 2, minimal size for split = 4, and a number of prepruning alternatives = 3 were used as stopping criteria.

Support Vector Regression (SVR)
A Support Vector Regression (SVR) is a non-parameter supervised machine learning model. The SVR function is derived based on training data to predict numerical values that find the best fitting line in the hyperplane with the maximum number of points [42]. The training attributes are defined by feature vectors x selected from the set of all features X (as in Table 1) and numerical label y ∈ R, which represents the prediction of TEC [43]. SVR is a regression function that is linear to the kernel function K(x i , x), trying to capture the non-linear relationship between selected features. Specifically, the prediction function is expressed as: where x denotes TEC, x i represents the prediction variables in Table 1 (such as TP, AL, SL, Bu, and GDP), α i and α * i are the Lagrange multipliers, and b is an intercept. This kernel function K(x i , x) was set up to map the nonlinear function to the flattest function in a feature space. We also applied the polynomial kernel, which is suitable for the normalized training dataset. The polynomial kernel is defined by: where d is the polynomial degree, and the kernel degree parameter specifies it. We input the normalized variables into RapidMiner ® Software for this study following five steps of building the prediction model. We applied the radial kernel with gamma = 0.001 and cache = 200. The hyperparameter C, which controls the relaxation of the error minimization problem, was set as 100. The convergence epsilon was set at 0.001 with a max iteration at 100,000. SVR constructed a hyperplane or set of hyperplanes in a high-or infinitedimensional space, which can be used for regression prediction with a fast algorithm and sufficient results.

Model Evaluation and Validation
To determine the accuracy and appropriate selection of the prediction method, we used two statistical measures, (1) mean absolute error (MAE) and (2) root mean square error (RMSE). A mean absolute error (MAE) measures the average magnitude of the errors in a set of predictions without considering their direction and the average of absolute differences between prediction and actual observation over the test samples with equal weight on all individual differences (Equation (7)) [19]. where n denotes the number of samples. |ŷ i − y i | are the absolute residuals between the actual values and the predicted values, where y i is the observed value for the observation andŷ i is the predicted value. This research also applied a root mean square error (RMSE) to measure the standard deviation of residuals. RMSE is the standard deviation of the prediction errors (residuals) and can measure residual dispersal. A basic formula of RMSE is shown below:

Variable Correlation
The Pearson correlation coefficients were calculated to determine the correlation among variables. The result is illustrated in Table 2. Total electricity consumption (TEC), a dependent variable in this study, appears highly correlated with sum of light intensity (SL), area of lit (AL), and built-up (BU) area. Moreover, the top three correlated variables are also highly correlated among themselves. The total population and household showed moderate correlation coefficients to TEC, while other variables had low correlation values (see Figure 4). The last row in Table 2 presents the weight of each variable to the prediction (dependence variable) by using a correlation calculation. The higher the weight of an attribute, the more relevant it is considered. These results helped the variables (feature) selection process to find the optimal variables, building the models by automatically selecting the appropriate candidate features by their information gain concerning the prediction results [44].

Variables Selection
An automatic feature selection process was used to tune down the number of predictor variables employed by the built models. Adjusting the number of parameters is a significant part of the model building process, which helps improve the processing time of each prediction model. RapidMiner ® Auto Model's Feature selection used a multi-objective evolutionary algorithm to find the best predictor variables (feature sets) concerning the trade-off between complexity and model accuracy [45]. The complexity was determined based on the set of features where each variable in the set contributes complexity. The result of each feature set is Pareto-optimal concerning these two dimensions (complexity and model accuracy). In this study, each model selected a specific feature set using an automatic feature selection process. The result of the feature selection process (predictors for each model) is illustrated in Table 3.

Estimation Models and Validation
The MR model is presented in Equation (9). The variables, including AL, BU, and area, were selected from the feature selection method. The MR model provided the lowest accuracy compared to DT and SVR (see Table 4).  The DT model was created from five attributes (one target attribute, TEC, and four predictor attributes, BU, TH, PD, and area) and their decision label value (see Appendix B Table A1). Total household (0.393) was the highest weight attribute, followed by an area of lit (0.192) and population (0.005). Figure 5b illustrates the DT prediction chart, which led to the best performance in electricity prediction among three models (see Table 4) with an RMSE of 0.24, MAE of 0.12, MSE of 0.06, and the highest R 2 of 0.95. The 210 training datapoints trained the final SVR prediction model with four attributes: TP, TH, SL, BU, and PD (see Figure 5c). The kernel function of each variable was present by w(x i ) in Figure 5c, and the offset is 0.655. As shown in Table 4, the SVR prediction model had the second-best prediction performance. A comparison of the prediction models' performance is illustrated in Table 4, in which DT provided the best performance with the lowest MAE, RMSE, and highest R 2 , followed by SVR and MR. Therefore, DT achieved the highest accuracy and was chosen to estimate the district-level electricity consumption in Thailand's EEC.

High Correlation of Geographic Variables to Electricity Consumption
This study applied the socioeconomic survey data (TH and TP) and extracted geographic data from remote sensing images (SL, AL, and BU). The extracted data from satellite images including SL, AL, and BU presented a very high correlation to electricity consumption (0.77 ≤ R ≤ 0.83) compared to socioeconomic data (TH and TP), which provided R = 0.6. However, the GDP also provided a high correlation to total electricity consumption at 0.83. These results suggest that the variables extracted from remote sensing products could estimate the electricity consumption at the district level. Table 2 illustrates that SL was highly correlated with TEC. Moreover, the built-up area was a common predictor, chosen by all models through the feature selection method (see Table 3). Hence, we conclude that the remotely sensed extraction data could estimate electricity consumption at a district level.

Remotely Sensed Data-High Performance and Ease of Extraction
The past study confirmed that the nighttime light retrieved from the satellite could quantify electricity consumption at a provincial scale [1]. Moreover, we found that the lit area, sum of light intensity, and built-up area could also estimate the electricity consumption at the district level. These geographic variables played a considerable role in prediction models by providing high weight (see Table 3). For example, the built-up area was the second-highest weight (0.2) in the DT prediction model, and the area of lit was the highest weight (0.86) in the MR prediction model. However, the built-up area was more complex in terms of extraction methodology, and it was challenging to control the accuracy of data extraction results. Figure 6 illustrated that the built-up area is highly correlated to the area of lit and sum of light intensity. For this reason, the nighttime light intensity was a better proxy indicator to estimate the electricity consumption at the district level.

Model Complexity and Accuracy, Key Caveats, and Limitations
There were various methods to estimate electricity consumption in past research. However, because of the limitation of the number of datasets, we selected three MLAs, including Multiple Regression (MR), Decision Tree (DT), and Support Vector Regression (SVR), to predict the district's electricity consumption. From Figure 5, it is clear that DT's predicted values showed the closest one-to-one relationship to the actual values, which indicates that the DT model provided the best performance with the lowest MAE, RMSE, and the highest R2. On the other hand, MR and SVR were more likely to underestimate the actual values with weaker goodness of fit than the DT model.
The regression method in machine learning may be challenged by the complexity of predictor variables, which could generate a high bias model. The increasing complexity is possibly caused by the decreasing of model accuracy. Thus, we need to optimize both complexity and accuracy, avoiding a more complex and over-fitting model. Figure 7 compares the trade-off between the model complexity and the error rate of MR, DT, and SVR models.  The MR provided less complexity but the highest error, while DT provided the best performance with moderate complexity but the lowest error. In addition, the model parameter adjusting was a key to the predicting result. We let each model find the optimal parameters that provided the best training result. However, the prediction models may still need more adjusting in the details of each hyperparameter. The most critical hyperparameters are (1) stopping criteria and (2) pruning method, which control the balancing between model accuracy and complexity. Consequently, we need to balance complexity and accuracy, improving the model performance.
In addition, DT is a complex model which tends to overfit. We can avoid this problem by setting the constraints on model parameters and pruning. The constraints on tree size should be set by fine-tuning the hyperparameters such as (1) minimum samples for a node split and a terminal node, (2) maximum depth of the tree (vertical depth), and (3) maximum feature to consider for a split. The tree pruning should be optimized between the size of the learning tree (horizontal grow) and predictive accuracy, then measured by cross-validation set for best optimizing performance. However, based on the number of data points in this study, our DT model was a small tree with a lower risk of overfitting than a large tree. The overfitting problem should be considered when applying the model to a larger dataset in further study.
The model accuracy can be impacted by (1) the quality of the input dataset and (2) model adjusting parameters. Principally, the input data validation may be applied to improve the model accuracy. This study only validated the accuracy of remote sensing data extracting results of 2012 and 2016. We used 500 random sampling points from the national land-use dataset, verifying them with virtual ground truth from Landsat 7 and 8 images. In this work, the accuracy of extracting data for 2012 and 2016 was 78% and 85%, respectively, which was adequate for managing electricity on the district grid in EEC, Thailand, without the dwelling dataset. The more detailed validation of remote sensing data extraction should be conducted because the accuracy of the data extraction process will affect the result of the prediction model. However, if we can improve the extraction accuracy, the models should lead to more accurate predictions. In addition, we assumed that there are no invalid data in the socio-economic dataset following the standard of data source (The National Statistic Office, Thailand). However, this may not be true.
Despite the usefulness of this study's generated models and findings, we acknowledge that some caveats should be mentioned. First, this study's available data were somewhat limited due to data availability in both spatial and temporal domains. The data were available only from 2012 to 2018, covering a small number of districts (30 districts) in the study area. Further study still needs to utilize the hyperparameter tuning of the Decision Tree model in the following metrics: the total number of nodes, total number of leaves, tree depth, and number of attributes used to optimize model accuracy. Moreover, experimenting on different geographic characteristic areas will enhance understanding of the prediction model at the diverse remotely sensed dataset.

Conclusions
This study compared three estimation models for electricity consumption at the district level in Eastern Thailand and evaluated the usefulness of remotely sensed data extraction as well as a general socioeconomic dataset as model features. The result revealed that the Decision Tree (regression method) algorithm is the best-fitting model with the highest accuracy at the lowest MAE, RMSE, and fastest total time than the other models. The findings from this study also highlighted the benefits of utilizing remotely sensed extracted variables, including lit area, sum of light intensity, and built-up area. Specifically, the nighttime light data were significant for estimating electricity consumption at the district level. They can help understand human activity and reflect the economic status and its impact in the industrial parks and neighborhood areas. These data also represent an excellent proxy indicator to measure economic activity, which is valuable for developing countries with a limited budget and data, such as Thailand.
Nevertheless, the nighttime light data capture an artificial light that varies by the intensity of the source of light generated by the bulbs. LEDs have increasingly replaced regular light bulbs in the past decade because they require less energy and provide the same brightness. The changing of light bulb types may reflect the utilization of nighttime light intensity for estimating electricity consumption. Starting from 2015, Thailand's government has promoted LED usage within the 20 years of saving energy plan, which is still ongoing [46]. Thus, this paper, covering the period from 2012 to 2018, assumed little impact from the policy and has yet to consider the effect of different light sources.