VIIRS Nighttime Light Data for Income Estimation at Local Level

Abstract: The aim of the paper is to develop a model for the real-time estimation of local-level income data by combining machine learning, Earth Observation, and Geographic Information Systems. More exactly, we estimated the income per capita by means of a machine learning model for 46 cities with more than 50,000 inhabitants, based on the National Polar-orbiting Partnership–Visible Infrared Imaging Radiometer Suite (NPP-VIIRS) nighttime satellite images from 2012–2018. To automate the calculation, a new ModelBuilder-type tool called EO-Incity (Earth Observation–Income city) was developed within the ArcGIS software. The sum of light (SOL) data extracted by means of the EO-Incity tool and the observed income data were integrated in an algorithm within the MATLAB software in order to calculate a transfer equation and the average error. The results were subsequently reintegrated in EO-Incity and used for the estimation of the income value at local level. The regression analyses highlighted a stable and strong relationship between SOL and income for the analyzed cities. The EO-Incity tool and the machine learning model proved to be efficient in the real-time estimation of the income at local level. When integrated in the information systems specific to smart cities, they can serve as a support for decision-making in order to fight poverty and reduce social inequalities.


Introduction
Over the last two decades, due to rapid digitalization and technological development, the "smart city" concept has been widely used both in the scientific literature and in international policies. Despite the history and popularity of the concept, there is neither a unique nor a one-size-fits-all definition for it [1]. As cities are engines of economic growth, accounting for 80% of the global Gross Domestic Product (GDP) [2], they also have a significant impact on social and economic welfare, thus being considered key elements for tackling future challenges. This is also highlighted by the fact that more than half of the global population lives in urban areas (in Europe more than 75 percent), a ratio which is constantly increasing [2]. All these lead to an increased consumption of resources in cities (mainly energy resources), with a significant impact on the surrounding environment and on urban sustainability.
Although smart cities have different interpretations [3][4][5][6][7][8][9][10] and have also attracted criticism [11][12][13][14][15][16], the concept seems to have established itself by becoming embedded in the scientific literature, with ever-increasing practical applications. Over the years it has become a widely used policy concept for cities, its popularity being supported by its perceived connection to the globally accepted Sustainable Development Goals [17].
Cities are constantly changing, thereby continuously presenting city managers with new challenges in transforming urban areas into more livable spaces. The key to successfully adapting to these changes is in the hands of local authorities, which have the responsibility to provide accurate and timely spatial information for monitoring, managing, and planning urban areas [18,19]. Therefore, the smart city concept takes a holistic approach to utilizing information technologies combined with real-time analysis methods, paving the way towards a more sustainable economic development. This holistic view of cities made possible by Earth Observation (EO) offers a useful alternative to data collection over spatial units, thus allowing a reliable way of gathering spatial data for monitoring the rapidly changing urban areas [20,21]. Satellite-based EO data can help fill some of the gaps in the available tools and provide the data and knowledge we need in cities.

Research Overview
Scientific literature reveals a great variety of applications for EO data, not only in analyzing environmental changes but also in regional science. Musakwa and Niekerk [22] have monitored sustainable urban development using built-up area indicators. Patino and Duque [23] reviewed the potential applications of satellite remote sensing to regional science research in urban settings by focusing on social problems with a spatial dimension. Applications of EO can also be found in assessing social inequality and quality of life by detecting and monitoring the formation of slums [24,25], in urban growth and urban sustainability [26,27], and in the estimation of population or GDP [28][29][30]. These studies highlighted the major challenges of using EO data to support sustainable urban research and policy development.
Remote sensing and Geographical Information Systems (GIS) can provide new perspectives for the assessment of economic performance at both national and sub-national level. In recent years, increasing attention has been paid to the use of nighttime satellite images for extracting socio-economic data for a given spatial unit when such data are missing or of low quality. In terms of income per capita prediction, Ebener et al. [31] developed a global model based on Defense Meteorological Satellite Program-Operational Linescan System (DMSP-OLS) satellite images, achieving promising results both at national and at sub-national level. Subsequently, Li et al. [32], after studying the relationship between nighttime lights (NTL) data and economic indicators, noticed that the NPP-VIIRS data have a better capacity for predicting the gross regional product (GRP) than the DMSP-OLS data. In the present study, too, the NPP-VIIRS data proved to be a good starting point for estimating the income at local level.
Nighttime light data captured by satellites are a highly topical means of assessing economic development and have been used in many recent studies [29,33,34]. Dai et al. [35] estimated the GDP for 31 provinces and 341 cities in mainland China, based on the DMSP-OLS and NPP/VIIRS satellite data and on the existing statistical data. Comparing the accuracy of the results showed that the DMSP-OLS data can be used for the estimation of the GDP at province level exclusively, while the NPP/VIIRS data can be used both at province and at local (city) level. Basihos [36] estimated the GDP at sub-national level in Turkey for the 2001-2013 period based on nightlights, using a neural network algorithm, and rebuilt the GDP series for the 1992-2001 period, achieving statistically significant results. Machine learning was also used successfully by Jean et al. [37] in estimating consumption expenditure for five countries in Africa based on satellite images. Määttä and Lessmann [38] applied the random forest machine learning algorithm to predict whether the light corresponding to a pixel in DMSP-OLS nightlight data is of human origin or not, thus creating a Human Lights product. Machine learning algorithms (artificial neural networks) were used by Subash et al. [39] to predict poverty at sub-national level in India based on the DMSP-OLS nightlight data.
Remote Sens. 2020, 12, 2950
The NTL dataset is not only useful in updating the analysis connecting nighttime lighting patterns with income inequalities, but also serves the purpose of validating the robustness of our identification and estimation strategy. Our analysis is in line with this mainstream research, using nighttime lights datasets (NPP-VIIRS) and a machine learning model for Romanian cities from 2012 to 2018 and for Hungarian cities from 2014 to 2016, in order to estimate per capita income.
For this purpose, we selected 46 Romanian cities and 19 Hungarian cities with a population over 50,000 inhabitants. There is a direct and positive correlation between the size and the economic complexity of cities: bigger cities have more complex economic functions, related to industries and a wide variety of services, while smaller cities have a large agricultural sector or one dominant industry (for example, mining) or service function (generally tourism) [40,41]. For these reasons, there is no significant correlation between the economic output of smaller cities and night lights, the latter being largely generated by industries and service-related functions [30].
The analysis is expected to contribute to the development of novel methods for improved EO-based urban services to support smart and sustainable urban development. This study proposes a new tool (abbreviated EO-Incity in this study) designed to estimate the income at sub-national level based on the nighttime satellite images.

Study Area
Romania is organized into eight development regions (NUTS-2), 41 counties and the Municipality of Bucharest (NUTS-3), and 3181 LAU1 (local administrative units: communes and cities). For this research, we selected 46 municipalities and cities (LAU1) with a population of more than 50,000 inhabitants (Appendix C), out of a total of 103 municipalities and 216 cities in Romania (Figure 1). These cities are the major economic hubs of the country [42][43][44], especially the capital, the Municipality of Bucharest, the sixth city in the EU in terms of population (2,131,034 inhabitants in 2019), followed by six regional centers with a population exceeding 300,000 inhabitants, among them Iași.
In a country with an important rural population (around 48% of the total population), the major municipalities are the leading places for innovation and smart city solutions, connecting large rural hinterlands with the global flow of capital, goods, and information [46,47].

Data Collection
Within this research we used the NPP-VIIRS images [48] produced by the Earth Observations Group (EOG) at NOAA/NCEI (National Centers for Environmental Information), with a resolution of 15 arc seconds (approximately 500 m), for income estimation at local level in Romania. The Suomi NPP (National Polar-Orbiting Partnership) satellite was launched on 28 October 2011. Its VIIRS (Visible Infrared Imaging Radiometer Suite) sensor collects global observations spanning the visible and infrared wavelengths in 22 different spectral bands, one of which is the Day/Night Band (DNB), which records the intensity of nighttime lights at a spatial resolution of 15 arc seconds. A suite of average radiance composite images is produced from the nighttime DNB data. These composites are available at a monthly scale for the 2012-2019 period and at an annual scale for 2015 and 2016. In the present study, we used the monthly data. There are two configurations of the NPP-VIIRS monthly composites: "vcmslcfg", where data contaminated by stray light are corrected, and "vcmcfg", where data contaminated by stray light are removed [49]. We used the "vcmslcfg" layer. From the monthly images, an annual average pixel brightness was calculated, resulting in seven composite VIIRS nighttime light images, one for each year of the 2012-2018 period. Subsequently, the sum of light (SOL) was calculated with the ArcGIS 10.1 software (Environmental Systems Research Institute, Redlands, CA, USA) for each administrative territorial unit in Romania and later used as an input in calculating the local income.
The sum of light (SOL) was extracted by summing all pixel values of the nighttime light image in each administrative territorial unit.
The permanent resident population data by localities was obtained from the National Institute of Statistics [45] for the 2012-2018 period. The local tax income data for each administrative territorial unit were extracted from the MRDPA (Ministry of Regional Development and Public Administration) database for the 2012-2018 period [50] and are expressed in Romanian leu (1 RON = 0.21 EUR, June 2020). Subsequently, the local tax income data was calculated per person.

Methodology
Several studies highlighted a positive statistical relationship between the NTL data and GDP, both at national level and at sub-national level [33][34][35][51,52]. For Romania, Ivan et al. [30] noticed a strong relationship between NTL and income, as well as between NTL and the GDP at county level [53]. The NTL data correlated best with the income, which indicated their high potential for estimating the income at sub-national level.
The research on the relationship between SOL and local income for each administrative territorial unit of Romania (3181) highlighted a strong relationship (Pearson correlation coefficient ≥ 0.82, R² > 0.68) between the two variables for the municipalities and cities with a population exceeding 50,000 inhabitants (46 cities) (Table 1).
Based on these findings, we estimated the local income for the year 2018 for the 46 cities in Romania using a machine learning model within the MATLAB software.
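As an illustration of how the two statistics quoted above relate, the following is a minimal sketch in plain Python that computes the Pearson correlation coefficient and, for a one-predictor linear fit, the R² obtained by squaring it. The SOL and income values are hypothetical and are not taken from Table 1; the function name `pearson_r` is likewise only illustrative.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical SOL and per capita income values for five cities (not from the paper).
sol = [1200.0, 3400.0, 560.0, 8900.0, 2100.0]
income = [410.0, 690.0, 380.0, 1500.0, 520.0]

r = pearson_r(sol, income)
r_squared = r ** 2  # for a linear fit with a single predictor, R^2 equals r^2
print(f"r = {r:.3f}, R^2 = {r_squared:.3f}")
```

Squaring the reported lower bound of the correlation coefficient (0.82) gives roughly 0.67, which is in line with the reported lower bound for R².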
To automate the calculation, we developed a ModelBuilder-type tool in the ArcGIS software (Figure 2). The tool was called EO-Incity (Earth Observation-Income city). ModelBuilder is a visual programming language in which geoprocessing and data processing tools can be interactively combined to create a workflow. To create the model, two sub-models were first developed: the first masks the study area using the "Extract by Mask" tool, and the second calculates the annual average (for the images available on a monthly basis) using the "Cell Statistics" tool. The "Extract by Mask" tool extracts the cells of a raster that correspond to the country limit defined by the input mask, while the "Cell Statistics" tool calculates a per-cell statistic (the mean in our case) from the 12 monthly rasters. In order to calculate the sum of light (SOL) for each individual administrative territorial unit, the "Zonal Statistics as Table" tool was used, which calculates statistics (the sum in our case) on the values of a raster within each administrative territorial unit.
The data preparation involves the following steps: (1) masking the downloaded raw monthly VIIRS satellite images (vcmslcfg layer) using the limit of the study area (the country limit in our case); (2) calculating an average annual pixel brightness from the 12 masked monthly images, resulting in a single nighttime light image; and (3) summing up the pixel values of the image for each territorial administrative unit, resulting in the sum of light (SOL) values for each city. In the case of annual images, step (2) is skipped. In the first part of the model, the EO-Incity tool masks the study area and calculates the SOL, using as input data the annual or monthly VIIRS satellite images.
In the second part of the model, it uses as input parameters the calculated SOL data and the equation resulting from running the MATLAB algorithm for the estimation of the income (Appendix A).
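The three data preparation steps can be sketched outside ArcGIS as follows. This is not the EO-Incity model and does not call any ArcGIS function; it is a plain-Python imitation on tiny nested lists, and the function names (`mask_raster`, `annual_mean`, `zonal_sum`), the use of `None` for NoData, and the toy rasters are all assumptions made for illustration.

```python
from collections import defaultdict

def mask_raster(raster, mask):
    """Step 1: keep cells where the mask is True; mark the rest as NoData (None)."""
    return [[v if m else None for v, m in zip(r_row, m_row)]
            for r_row, m_row in zip(raster, mask)]

def annual_mean(monthly_rasters):
    """Step 2: per-cell mean over the monthly rasters, ignoring NoData cells."""
    rows, cols = len(monthly_rasters[0]), len(monthly_rasters[0][0])
    out = [[None] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            vals = [m[i][j] for m in monthly_rasters if m[i][j] is not None]
            if vals:
                out[i][j] = sum(vals) / len(vals)
    return out

def zonal_sum(raster, zones):
    """Step 3: sum of light per zone id; NoData cells and unzoned cells are skipped."""
    totals = defaultdict(float)
    for r_row, z_row in zip(raster, zones):
        for v, z in zip(r_row, z_row):
            if v is not None and z is not None:
                totals[z] += v
    return dict(totals)

# Toy example: two "monthly" 2x3 rasters; the last column lies outside the study area.
month1 = [[1.0, 2.0, 9.0], [3.0, 4.0, 9.0]]
month2 = [[3.0, 2.0, 9.0], [5.0, 6.0, 9.0]]
mask = [[True, True, False], [True, True, False]]
zones = [["A", "A", None], ["B", "B", None]]  # hypothetical administrative units

masked = [mask_raster(m, mask) for m in (month1, month2)]
mean_image = annual_mean(masked)
sol = zonal_sum(mean_image, zones)
print(sol)  # {'A': 4.0, 'B': 9.0}
```

For an annual composite, the `annual_mean` step would simply be skipped and `zonal_sum` applied to the masked image directly.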
In the MATLAB software, the calculation algorithm, transposed into a script, involves on the one hand obtaining the parameters of a transfer function for the estimation of the income based on the nighttime satellite images and, on the other hand, calculating the average error (Appendix B). The input comes from the EO-Incity tool and consists of the sum of light (SOL) and the per capita income corresponding to the cities that exceed a certain population threshold (for this particular study, the cities with a population over 50,000 inhabitants). The parameters of the transfer equation are calculated by integrating all the annual value sets generated from satellite images and from the processing of the income data in a one-step learning algorithm. The normal equation was chosen because its analytical approach to linear regression makes it an efficient and fast way of calculation:
$$\theta = (X^{T}X)^{-1}X^{T}y$$
where $\theta$ denotes the parameters of the linear regression; $X$ is the matrix containing the independent variables; and $y$ is the matrix containing the dependent variables. The same normal equation is used to calculate the errors of the linear regressions for each annual set of values. The errors for each year are then averaged over the entire calculation period:
$$E_j = X_j\theta_j, \qquad \varepsilon_j = y_j - E_j, \qquad \bar{\varepsilon} = \frac{1}{n}\sum_{j=1}^{n}\varepsilon_j$$
where $E_j$ denotes the values estimated using the regression equation for year $j$; $\theta_j$ the parameters of the linear regression for year $j$; $X_j$ the independent variable for year $j$; $y_j$ the dependent variable for year $j$; $\varepsilon_j$ the error of the linear regression for year $j$; $\bar{\varepsilon}$ the average error; and $n$ the number of years. The output of the MATLAB script is then reintegrated in EO-Incity and used, for the reduction of errors, in the estimation of the income values based on satellite data (SOL) for the years when observed income data are not available. The income data resulting from running the EO-Incity tool are saved in .dbf format, which makes them easy to view and process.
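The calculation described above can be sketched in Python as follows. This is not the MATLAB script from Appendix B; it is a minimal illustration under several assumptions: a single predictor (SOL) plus an intercept, so that the normal equation reduces to a 2×2 system; an error defined as observed minus estimated income; and a correction that adds each city's mean error to its estimate. All data values and function names are hypothetical.

```python
def normal_equation(x, y):
    """Solve theta = (X^T X)^{-1} X^T y for X = [1, x] (intercept plus one predictor)."""
    n = len(x)
    sx = sum(x)
    sxx = sum(v * v for v in x)
    sy = sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    det = n * sxx - sx * sx  # determinant of the 2x2 matrix X^T X
    if det == 0:
        raise ValueError("X^T X is singular: x values must not all be equal")
    theta0 = (sxx * sy - sx * sxy) / det  # intercept
    theta1 = (n * sxy - sx * sy) / det    # slope
    return theta0, theta1

def mean_city_errors(sol_by_year, income_by_year):
    """Per-city average over years of eps_j = y_j - E_j, with E_j = X_j theta_j."""
    n_years = len(sol_by_year)
    totals = [0.0] * len(sol_by_year[0])
    for x_j, y_j in zip(sol_by_year, income_by_year):
        t0, t1 = normal_equation(x_j, y_j)          # theta_j for year j
        for c, (xc, yc) in enumerate(zip(x_j, y_j)):
            totals[c] += yc - (t0 + t1 * xc)        # error of city c in year j
    return [t / n_years for t in totals]

# Hypothetical SOL and per capita income for three cities over two years.
sol_by_year = [[10.0, 20.0, 30.0], [12.0, 22.0, 33.0]]
income_by_year = [[110.0, 190.0, 310.0], [125.0, 210.0, 335.0]]

# Transfer equation fitted on all annual value sets pooled together.
pooled_x = [v for year in sol_by_year for v in year]
pooled_y = [v for year in income_by_year for v in year]
theta0, theta1 = normal_equation(pooled_x, pooled_y)

# Estimate a new year from SOL alone, then add each city's mean error.
mean_err = mean_city_errors(sol_by_year, income_by_year)
new_sol = [13.0, 24.0, 35.0]
corrected = [theta0 + theta1 * s + e for s, e in zip(new_sol, mean_err)]
print(corrected)
```

With more than one independent variable, the same formula would be solved with a general linear solver rather than the closed-form 2×2 inverse used here.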
We used the linear regression model in order to test the relationship between the estimated income and nighttime light/observed income, which is described mathematically in the form:
$$y = \beta_0 + \beta_1 x + \varepsilon$$
where $y$ represents the dependent variable (in our case the estimated income); $x$ represents the independent variable (the SOL or the observed income); $\beta_0$ is the intercept of the model, i.e., the expected value of $y$ when $x$ is zero; $\beta_1$ is the coefficient (multiplier) of the variable $x$; and $\varepsilon$ is the residual, i.e., the deviation of $y$ from its expected value for a given value of $x$. The betas, together with the mean and standard deviation of $y$, are the parameters of the model. To assess the accuracy of the regression models, we report R² and its adjusted counterpart (adjusted R²), both derived from the regression equation to quantify model performance. While R² shows the validity of the chosen model for explaining the variation of $y$ (the percentage of the variation in estimated income explained by the NTL), the adjusted R² is a correction to R² that takes into account the number of variables used in the model. The higher the R², the better the model.
The predictive ability of the model is often tested using the relative error (RE) or the residuals (the difference between the observed y values and the predicted y values) and the root mean square error (RMSE). RMSE measures the average error made by the model in predicting the outcome for an observation. As the square root of a variance, RMSE can be interpreted as the standard deviation of the unexplained variance, and it has the useful property of being in the same units as the response variable. RMSE provides a good picture of how accurately the response is predicted by the model, and it is the most important criterion for fit if prediction is the main purpose of the model. Lower values of RMSE indicate a better fit.
Additionally, the Akaike information criterion (AIC) is a refined technique based on in-sample fit that estimates the likelihood of a model to predict/estimate future values. The basic formula is defined as
$$\mathrm{AIC} = 2K - 2\ln(L)$$
where $K$ is the number of model parameters (the number of variables in the model plus the intercept) and $\ln(L)$ is the log-likelihood, a measure of model fit. The basic idea of AIC is to penalize the inclusion of additional variables in a model: each added parameter raises the criterion unless it improves the fit enough to compensate. The lower the AIC, the better the model.
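The two goodness-of-fit measures can be written down in a few lines. The sketch below uses the standard definitions (R² as one minus the ratio of residual to total sum of squares, and the usual degrees-of-freedom correction for the adjusted value); the function names and the sample values are illustrative assumptions, not figures from this study.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def adjusted_r_squared(r2, n_obs, n_predictors):
    """Penalize R^2 for the number of predictors relative to the sample size."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)

# Hypothetical observed and predicted incomes for four cities.
observed = [400.0, 650.0, 900.0, 1500.0]
predicted = [450.0, 600.0, 950.0, 1450.0]

r2 = r_squared(observed, predicted)
print(r2, adjusted_r_squared(r2, n_obs=len(observed), n_predictors=1))
```

With a single predictor and a few dozen observations, the adjustment is small, so R² and adjusted R² stay close to each other.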
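A minimal sketch of the two error measures follows. The RMSE uses its standard definition; the relative error is assumed here to be the signed residual divided by the observed value, which is one common convention and may differ from the one used by the authors. The sample values are hypothetical.

```python
import math

def rmse(observed, predicted):
    """Root mean square error, in the same units as the response variable."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def relative_errors(observed, predicted):
    """Signed residual as a fraction of the observed value (assumed convention)."""
    return [(o - p) / o for o, p in zip(observed, predicted)]

# Hypothetical observed and predicted incomes for three cities.
observed = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]

print(rmse(observed, predicted))
print(relative_errors(observed, predicted))
```

Under this sign convention, a negative relative error corresponds to overestimation and a positive one to underestimation.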
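To make the penalty concrete, the sketch below evaluates the AIC for a linear model under the common assumption of normally distributed errors, in which case the maximized log-likelihood depends only on the residual sum of squares. That distributional assumption, the function names, and the residual values are illustrative and are not stated in the text; K is counted as described above (variables plus intercept).

```python
import math

def gaussian_log_likelihood(residuals):
    """Maximized log-likelihood of a linear model with normally distributed errors."""
    n = len(residuals)
    rss = sum(e * e for e in residuals)
    sigma2 = rss / n  # maximum-likelihood estimate of the error variance
    return -0.5 * n * (math.log(2 * math.pi * sigma2) + 1)

def aic(log_likelihood, k):
    """Akaike information criterion: 2K - 2 ln(L)."""
    return 2 * k - 2 * log_likelihood

# Hypothetical residuals of a one-predictor model (K = 2: slope plus intercept).
residuals = [12.0, -8.0, 5.0, -9.0]
ll = gaussian_log_likelihood(residuals)
print(aic(ll, k=2))
print(aic(ll, k=3))  # same fit with one more parameter: AIC is higher by 2
```

An extra variable is therefore only worthwhile if it raises the log-likelihood by more than one unit.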

Results
In socio-economic calculations, income represents an important parameter for decision-making in order to fight poverty and reduce social inequalities. On the annual scale, an offset occurs between the statistically reported data and the specific budgetary projection. Our instrument offers the possibility of real-time estimation of these data at local level by means of Earth Observation (EO).
As a result of running the EO-Incity tool for the cities with more than 50,000 inhabitants in Romania, we obtained an estimation of the income for the year 2018. Comparing the estimated results with the observed values, we obtained a correlation coefficient of 0.86, with the model explaining 75% of the total variation, and a root mean square error (RMSE) of 247.
Taking into consideration the regression residuals, the highest negative differences (overestimation) are noticed for the cities in the east of Romania (Figure 3). The possible source of error is the overestimation of the population due to the practice of citizens of the Republic of Moldova of declaring their presence in these cities (Vaslui, Bârlad, Iași) to obtain Romanian citizenship [54][55][56]. As for the positive differences (underestimation), the largest were noticed in Slatina, home to a big electricity consumer, the largest aluminum producer in Europe, Alro SA, which also generates higher SOL values. In the case of the cities of Sibiu, Deva, Cluj-Napoca, and Sfântu Gheorghe, a possible explanation would be the underestimation of the resident population and/or the informal economy (on this subject, see Ghosh et al. [33]). In the case of Cluj-Napoca, one of the most important academic centers in Romania, another explanation would be the large number of students present but registered as residents elsewhere. The positive difference in Cluj-Napoca and other cities like Sibiu and Brașov may also be explained by tourism. Bluhm and Klaus [57] pointed out that stable lights may underestimate income in high-density places; therefore, they recommend using a top-coding correction. In our analysis we found several cities of similar size (e.g., Cluj and Iași; Brăila and Sibiu) where over- and underestimation of income per capita was related to the specific characteristics of Romanian cities mentioned above and not necessarily to the high density of the population. This is also backed by Figure 4. Thus, the value of the R² coefficient ranged between 0.68 in 2017 and 0.76 in 2013, which illustrates a relation between the two variables (SOL and income) that is stable in time. In the context of these findings, we proceeded to error reduction in order to improve the prediction.
A good strategy for improving the prediction is to minimize the term ε in Equation (5) as much as possible [58]. We use the error-related information for the period in which both satellite and revenue data are available to characterize the specific error for each particular case (city). The mean error obtained by averaging the regression errors of each year is applied to the values estimated for the year 2018. Table 2 shows a significant improvement of the correlation coefficient between the estimated and the observed values, which in this case exceeds the value of 0.93, while the total explained variability increases from 75 to 87%. RMSE is significantly reduced from 247 to 181.
As a result of the European Union's efforts to reduce electricity consumption by 80%, both Romania and Hungary are in full transition to LED lighting. The use of this technology implies a decrease in DNB radiance.
This decrease is most often outweighed by the increase in radiance in areas around cities due to urban sprawl or the "skyglow" effect (the clear sky predominantly scatters short-wavelength light) [59]. A more than threefold increase in scotopic illuminance outside the city was also observed in Hungary when using LED technology [60]. That study uses a radiative transfer model based on Monte Carlo simulation to compare different indicators used to qualify and predict the change in radiance under different scenarios of transition to LED lighting.
The impact of the decrease in radiance-due to LED transition-on the relationship with personal income can be limited by our methodological approach. The use of SOL attenuates the decrease in radiance observed in downtown areas. Another advantage is the use of machine learning that can adapt to such a change.

Validation of the Results and Discussions
In order to validate the EO-Incity tool, we used a similar methodology to estimate the income value at local level for Hungarian cities. The annual population and local tax income data for Hungary were extracted from the Hungarian Central Statistical Office [61] for the 2014-2016 period. The income data are expressed in Hungarian forint (1 HUF = 0.0029 EUR, June 2020) and have been reported per person.
Hungary is divided into eight regions (NUTS-2), 19 counties and the capital city, Budapest (NUTS-3), and 3155 LAU2 (local administrative units: settlements). In the present study, we selected 19 municipalities and cities with a population of more than 50,000 inhabitants (Figure 5 and Appendix C). The largest cities in Hungary, with a population exceeding 100,000 inhabitants, include the capital city, Budapest (1,693,051 inhabitants).
Income inequalities have a clear spatial distribution in Hungary: apart from the capital city, there is a strong decrease in income from west to east. The most developed areas can be found in the central and north-western parts of the country, with a strong peripheralization in the eastern parts. The economic spatial structure of the country has also put its mark on the cities, influencing the adoption of the "smart city" concept, which appeared mostly in the largest urban areas. The need to apply the smart city concept in urban development appeared in 2015, when a government resolution established the state regulatory and oversight responsibilities for smart city developments [62]. Since then, the spread of the smart city model has accelerated significantly in Hungary, especially in the larger urban areas: the Smart City Budapest initiative has been running in the capital since 2014, and Debrecen and Szeged have had a smart city vision and concept since 2016, while Miskolc, Kaposvár, and Szolnok joined the Open and Agile Smart Cities organization in 2016 [63].
As the smart city concept has a strong connection with sustainable development, Hungary has also committed itself to adopt a series of sustainable development goals (SDG). In this context, the use of earth observation solutions for an improved monitoring of the SDG implementation progress has also become a subject for discussion.
We have chosen Hungary for validating our results for two reasons: on one hand, from a historical point of view, as the northern part of Romania belonged to Hungary between the two world wars, showing a strong ethnic and cultural connection with the neighboring country. Secondly, from an economic point of view, as Romania and Hungary have a strong economic collaboration: a large number of Hungarian SMEs (Small and Medium size Enterprises) have been established in Romania and vice-versa in the last years, facilitating a strong cooperation between the two countries, in addition to the several cross border cooperation projects implemented on both sides of the border. Furthermore, even though the economic profile and size of the two countries differs greatly, in the last years the two countries have shown one of the highest rates of economic growth among EU Member States (4.6% in the case of Hungary and 4.2% in case of Romania in 2019 Q4) [64].
In the case of the Hungarian cities, the regression analyses highlighted a stable and strong relationship between the observed income and the estimated income (R² > 0.95) for each year during the 2014-2016 period (Figure 6 and Table 3).
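As an illustration of how such an R² value can be obtained, the following minimal Python sketch computes the coefficient of determination of a simple linear regression between observed and estimated income. It is not the MATLAB algorithm used in the study; the function name `r_squared` and the five-city sample values are hypothetical and serve only to show the calculation.

```python
import numpy as np

def r_squared(observed, estimated):
    """Coefficient of determination of the least-squares line observed ~ estimated."""
    observed = np.asarray(observed, dtype=float)
    estimated = np.asarray(estimated, dtype=float)
    # For a simple linear regression, R^2 equals the squared Pearson correlation.
    r = np.corrcoef(observed, estimated)[0, 1]
    return r ** 2

# Hypothetical values for five cities (arbitrary units), NOT data from the study.
observed_income = [10.0, 14.0, 19.0, 33.0, 52.0]
estimated_income = [11.0, 13.5, 20.0, 31.0, 54.0]

r2 = r_squared(observed_income, estimated_income)
print(f"R^2 = {r2:.3f}")
```

Under this assumption, an R² above 0.95 means that a straight line through the estimated values accounts for more than 95% of the variance of the observed values.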
The results validate the methodology described in the present paper for Hungarian cities with more than 50,000 inhabitants. The EO-Incity tool can be used on a regional scale (being validated in Hungary as well) to fill data gaps or collect new data at city level. The tool, created to estimate income at local level from VIIRS nighttime satellite images, enables fast, automated, and real-time income estimation.
A better spatial resolution of nighttime light data would improve small-scale income estimation. Luojia1-01 nighttime light images, with a spatial resolution of 130 m compared to 500 m for NPP-VIIRS, contain more information and spatial detail than DMSP-OLS or NPP-VIIRS nighttime light data and can provide more satisfactory results on a small scale [65]. In the future, we propose to estimate the income at local level using data from the Luojia1-01 satellite instead of NPP-VIIRS. Luojia1-01 was launched in June 2018, so the main limitation of these images compared to NPP-VIIRS is the lack of multi-temporal images, which limits their use in time series analysis. Other valuable sources of nightlight data are nighttime photographs from the International Space Station, with a moderate spatial resolution (between 5 and 200 m), as well as maps of urban nighttime lights at fine spatial resolutions obtained from airborne sensors for studying the socio-economic properties of cities [66].
The income estimation tool developed in the paper (EO-Incity) has not provided any evidence for a hypothetical correlation with the functional typology of the cities. However, the theoretical positive correlation between city size and the functional complexity of cities highlights the importance of overestimations in the population for cities located close to the eastern border of Romania and for those cities with important educational functions (Cluj-Napoca and Iași).

Limitations
Even though the correlations between income and nighttime images are statistically representative and the share of total variability explained is very high, the extension of the estimation method to a regional scale is currently limited by the short period for which both income data and satellite images are available (2014 to date).
Another limitation is the spatial inhomogeneity of cities caused by urban development strategies that differ from one country to another. Differences in the methodology for reporting income data and problems related to data reliability must also be taken into account, as must external factors such as new lighting technologies or the underestimation of income in high-density places.
The third limitation of the study is the 500 m resolution of NPP-VIIRS satellite images, which, compared to the 130 m resolution of Luojia1-01 images, does not offer very high accuracy at the scale of urban areas. It is also necessary to compare income estimates from nighttime satellite images in high-density places before and after applying top-coding correction.
A successful strategy for reducing these constraints would be to include more cities in the machine learning processing and to develop a methodology for calculating the transfer function at the regional level instead of the national one.
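To make the resolution gap concrete, the short Python sketch below compares how many pixels cover one square kilometre at the two nominal resolutions quoted above. It assumes idealized square pixels with edge lengths of exactly 500 m and 130 m; the helper `pixels_per_km2` is a hypothetical name introduced for this illustration.

```python
def pixels_per_km2(resolution_m):
    """Number of square pixels of the given edge length (in metres) covering 1 km^2."""
    return (1000.0 / resolution_m) ** 2

viirs = pixels_per_km2(500)   # nominal NPP-VIIRS resolution quoted in the text
luojia = pixels_per_km2(130)  # nominal Luojia1-01 resolution quoted in the text

print(f"NPP-VIIRS:  {viirs:.1f} pixels per km^2")
print(f"Luojia1-01: {luojia:.1f} pixels per km^2")
print(f"Ratio:      {luojia / viirs:.1f}x")
```

Under these assumptions, a Luojia1-01 image samples the same urban area with roughly 15 times as many pixels as an NPP-VIIRS image.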
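The idea of a regional rather than a national transfer function can be sketched as follows. This minimal Python example assumes a linear transfer function of the form income = a · SOL + b, which is an assumption made here for illustration and not necessarily the form used in the study. The function `fit_transfer_functions`, the region labels, and all numeric values are hypothetical.

```python
import numpy as np
from collections import defaultdict

def fit_transfer_functions(records):
    """Fit income = a * SOL + b separately for each group key.

    records: iterable of (group, sol, income) tuples.
    Returns a dict {group: (a, b)}.
    """
    grouped = defaultdict(list)
    for group, sol, income in records:
        grouped[group].append((sol, income))

    coefficients = {}
    for group, pairs in grouped.items():
        sol, income = np.array(pairs, dtype=float).T
        a, b = np.polyfit(sol, income, deg=1)  # least-squares straight line
        coefficients[group] = (float(a), float(b))
    return coefficients

# Synthetic records (region, SOL, income), invented for illustration only.
records = [
    ("west", 100.0, 25.0), ("west", 200.0, 45.0), ("west", 300.0, 65.0),
    ("east", 100.0, 15.0), ("east", 200.0, 25.0), ("east", 300.0, 35.0),
]

# One transfer function per region versus a single national one.
regional = fit_transfer_functions(records)
national = fit_transfer_functions([("all", s, i) for _, s, i in records])

for name, (a, b) in {**regional, **national}.items():
    print(f"{name}: income = {a:.3f} * SOL + {b:.3f}")
```

In this synthetic case the two regions have different slopes, so a single national line would systematically underestimate income in one region and overestimate it in the other, which is the kind of error a regional transfer function is meant to reduce.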

Conclusions
EO-Incity proved to be an efficient tool for the real-time estimation of income at local level. The results have a high potential for integration in the information systems specific to smart cities, as support for decisions on the more efficient allocation of financial resources to fight poverty and reduce inequalities. In this way, they may make an important contribution to the improvement of city management and planning and, indirectly, to the improvement of the quality of life in cities adopting smart solutions.
The regression analyses illustrated a very strong relationship between SOL and income for cities with more than 50,000 inhabitants. This finding is in line with our previous research [30], and it underlines the direct relationship between city size and the economic complexity of cities, on the one hand, and the intensity of nightlights, on the other. The combination of the machine learning model with the EO-Incity tool, together with a strategy for error reduction, enabled an efficient and real-time estimation of the income, which explained 87% of the total variability.
In the future, we wish to apply the methodology and the tool in other countries, both in Europe and on other continents. Moreover, we intend to reconstruct income data with this tool in countries where statistical reporting is not available or reliable.