Nexus between Household Energy and Poverty in Poorly Documented Developing Economies—Perspectives from Pakistan

Abstract: The indicators measuring socioeconomic wellbeing, such as the human development index (HDI) and multi-dimensional poverty indicator (MPI), recognize energy as an important resource for human development. However, energy has not been given due weight in determining HDI or MPI, except as a fractional contributor to MPI calculations. This study presents a regression model to establish an energy–poverty nexus in Pakistan, utilizing a real-world dataset. Defining poverty in terms of per-capita income (PCI), the proposed model incorporates education-based parameters along with the energy-dependent indicators linked to households in Pakistan. The data, aggregated at the district level, are extracted from the Census 2017 campaign of the Pakistan Bureau of Statistics (PBS). Statistical analyses indicate that energy-based identifiers correlate well with the PCI and augment the education-only model, capturing 94% of the variability in PCI vs. 78% for the education-only model. The study highlights the criticality of relevant data collection and data-driven planning in Pakistan for creating synergy in energy planning and poverty alleviation programs and provides recommendations for considering energy as an important and integral contributory factor in the human development index (HDI).


Introduction
Energy, similar to food, clothing, and shelter, has long been an essential human need. As the world moved towards more civilized living and increased mechanization, energy's role in attaining other human needs became apparent [1]. When energy-related developments started gaining momentum during the 20th century, access to and sustained availability of energy started impacting every facet of human life and development. In particular, the last few decades saw increased interest among researchers in exploring energy's role in the socioeconomic wellbeing of societies [2–7]. In 1990, Alam et al. found a "significant" link between the physical quality of life and per-capita energy consumption using "World Energy Supplies" data, 1950-1974 [2]. In 2000, Alan D. Pasternak, utilizing data from World Energy Supplies, 1997, delved into a quantitative relationship between revealing how dependence on energy imports drains the resources of a poor country, resulting in the worsening of the overall state of poverty.
Despite the energy-poverty nexus being a reality, energy has not found its rightful place in studies and in poverty alleviation programs in developing countries including Pakistan. Numerous studies have already established the impact of household energy choices on education, health, quality of life, and wellness in society, indicating the interactive relationship between these parameters [3,15,24,25]. However, no research has quantitatively linked household energy type and wellness indicators such as per-capita income (PCI). Available literature is also limited in size and scope for linking the required amount and type of energy for raising the low-income population in Pakistan above the poverty line [17,21,23]. To the best of our knowledge, the energy-poverty nexus is not fully examined within the context of developing economies including Pakistan. Consequently, there has been no realization of the critical need for relevant data collection either. The availability of appropriate and reliable data could lead to synergized and sustainable poverty alleviation programs incorporating measures for the provision and productive use of energy [11,20]. Research on the impact of low-density, unclean, and unaffordable energy resources on the state of poverty is also limited [18,23,26,27]. Pakistan does not have any energy-linked poverty indicators based on which its energy policy can be aligned to contribute towards the reduction in poverty. The realization of the energy-poverty nexus within the context of developing economies in general, with particular reference to Pakistan, would help to align the poverty-reduction strategies and efforts.
It is well understood that education has a profound effect on alleviating poverty and improving earning abilities [24,25,28,29]. In this paper, in addition to education, we explore a statistical approach to establish the energy-poverty nexus by examining the close relationships between poverty levels, living standards, and types of household energy access and consumption at the district level across Pakistan. The energy-poverty nexus is established by examining and analyzing the household data collected through the yet-unpublished Census campaign in 2017 [30]. Our analyses provide an improved understanding of the energy-poverty interplay, highlighting its oversized role in poverty/wellness, defined in terms of per-capita income (PCI). Further, this research attempts to provide a clear linkage between the PCI on the one hand and education, other economic indicators (e.g., living standard), and the type of household energy sources on the other. This is likely to give policy makers the insight needed to incorporate household energy in the strategic planning of poverty alleviation programs.
To establish the energy-poverty nexus, we used linear regression analysis. Regression analyses are suitable for predicting continuous dependent variables, i.e., PCI, based on independent variables including education, living standards, and types of energy consumed. To this end, this paper is organized as follows: Section 2 explains the data, methods, and model used for the analysis. The results are presented in Section 3. Section 4 discusses the results, and the findings are summarized in Section 5, followed by study limitations in Section 6 and the conclusion in Section 7.

Data Collection and Preparation
The socioeconomic wellbeing of a society is measured in terms of HDI, MPI, and/or PCI, among which HDI and MPI are better indicators. HDI, comprising three parameters, incorporates education and income. MPI has education, cooking fuel, sanitation, drinking water, electricity, and housing assets as six of its seven contributory elements. The district-level Census 2017 data for this research came from the Pakistan Bureau of Statistics (PBS). The data contain different parameters such as demography, literacy rate, education level, employment including that in foreign countries, homeless people, category, vintage and ownership status of housing, household facilities, the type of energy used for cooking and for lighting, household density, water, and access to media [30]. Unfortunately, HDI and MPI data covering the entire population of Pakistan were not available. On the other hand, while PCI/income is not the optimum choice for measuring human development, it is often considered the right indicator to evaluate poverty [7,15]. Available PCI data aggregated at the district level pertaining to the federal capital and provincially administered districts (117 districts in total) were, therefore, obtained from the Pakistan Bureau of Statistics.
Thus, the explanatory variables for this study are based on selected parameters from both HDI and MPI besides the factors evaluated in earlier energy-wellness-related studies [2,7,15,20,23,24], whereas PCI, which depends on the earning ability and productivity of the household members, has been used as the outcome (dependent) variable. The PCI values are the average income per person per year for the respective district.
A sample size of 31 out of 117 districts was randomly selected to provide equal representation in all four provinces of Pakistan. The randomization ensured removal of any possible bias in our analysis, and the sample size of minimum 30 data points ensured t-distribution approaching the z-distribution [31]. All independent parameters pertaining to a given district were converted into percentages with respect to the population and number of households of that district.
The selected parameters, termed predictors, are grouped into five groups or categories, with 20 predictors in total:

Preliminary Data Analysis
As a preliminary step, the data matrix consisting of the above independent variables (or predictors) and the dependent variable, PCI, was subjected to correlation analysis. The district-wide average or mean values of predictors showed a correlation (r) of 0.73 (p < 0.001) with the PCI. Additionally, strong positive and negative correlations were observed among various predictor variables. Literacy was correlated with primary and SSC (r = 0.85) and SSC with degree (r = 0.78). Cooking and lighting energy exhibited negative correlations: a very strong negative correlation (r = −1) between "gas" and "wood", the two cooking fuel types, means that houses with gas supply do not need wood for cooking, and vice versa; a reasonable negative correlation (r = −0.78) between "electricity" and "K2 oil", the two lighting sources, indicates that households with electricity available are less likely to need K2 oil for lighting; a correlation (r = −0.94) between "pakka" and "kacha"-type houses implies that the two types are inversely interrelated. Additionally, all predictors are linear, ranging from 0% to 100%. This preliminary analysis indicated that a linear regression model may be suitable to provide a linkage between the predictors and the outcome, PCI, provided the following assumptions are met: (1) no predictors are perfectly correlated with each other (no collinearity), (2) residuals have constant variance (homoscedasticity), (3) residuals are normally distributed, and (4) residuals are not correlated with each other (no autocorrelation). We decided to explore the linear regression model via ordinary least squares (OLS) as well as Ridge regression. Since the predictor variables indicate strong multicollinearity, we decided to resolve this first, as described in the next section. Ridge regression was discontinued after resolving the collinearity issues, as OLS provided adequate estimates. Detailed residual analyses are presented in Section 3.
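The pairwise screening described above can be sketched in a few lines. This is a minimal illustration, not the authors' procedure: the function name `correlation_report`, the 0.75 threshold, and the synthetic "district" values are all assumptions made for the example, and none of the numbers come from the Census 2017 data.

```python
import numpy as np

def correlation_report(X, names, threshold=0.75):
    """Return the Pearson correlation matrix and the strongly correlated pairs."""
    R = np.corrcoef(X, rowvar=False)  # columns of X are the variables
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(R[i, j]) >= threshold:
                pairs.append((names[i], names[j], round(float(R[i, j]), 2)))
    return R, pairs

# Synthetic illustration only (hypothetical values for 31 "districts").
rng = np.random.default_rng(0)
gas = rng.uniform(0, 100, size=31)       # % of households cooking with gas
wood = 100.0 - gas                       # complementary share, so r = -1
literacy = rng.uniform(20, 90, size=31)  # unrelated predictor
X = np.column_stack([gas, wood, literacy])

R, pairs = correlation_report(X, ["gas", "wood", "literacy"])
print(pairs)
```

Two shares that always sum to 100%, as gas and wood do here, are perfectly negatively correlated by construction, which is why one member of such a pair has to be dropped before regression.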

Data Analysis: Resolving the Collinearity Issue
In statistical analysis, two (or more) predictor variables are subjected to a multicollinearity test, since the phenomenon of multicollinearity leads to skewed results in regression models [32,33]. Therefore, the selected parameters have been subjected to two-step analyses. The first step as explained below explores the existence of collinearity, resolves it by dropping the redundant independent variable(s), while retaining the remaining variables for regression analysis. During the second step, the retained variables are subjected to regression analysis.
The data matrix X, consisting of 20 predictors (columns) and 31 districts (rows), is tested to establish a suitable predictive regression model. Since regression analyses are very sensitive to collinearity within the data matrix, the data matrix "X" is analyzed using the Belsley [33] collinearity diagnostics function "collintest" in MATLAB® software. This test provides the "condition indices (CIs)" and the "variance-decomposition proportions (VDPs)" of the data matrix "X". The CIs identify the number and strength of near dependencies in the data matrix "X", whereas VDPs identify groups of predictors with interdependency coefficients between 0 and 1, and the extent to which the dependencies may degrade the regression. This test identified five interdependent groups with CIs greater than 10 (a typical threshold) and VDPs greater than 0.5, namely, literacy and primary in PP, bath and toilet in HF, and all predictors within HT, CES, and LES. In light of this test, 10 predictors, x1 through x10 (primary, SSC, degree, employed, pakka, potable water, kitchen, bath, wood, and electricity), were retained for subsequent analysis. These predictors were tested again for collinearity, and the resultant CIs and VDPs are shown in Figure 1 for each predictor variable. As can be observed in Figure 1, there is mild collinearity among variables x1, x2, and x10 and between variables x6 and x8. However, the corresponding VDPs are very close to the tolerance of 0.5, indicating only a marginal influence on the regression; these variables are, thus, retained.
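For readers without MATLAB, the quantities behind the Belsley diagnostics can be sketched with NumPy. This is an illustrative reimplementation of the textbook definitions (unit-length column scaling followed by a singular value decomposition), not the "collintest" code itself; the function name `belsley_diagnostics`, the choice not to add an intercept column, and the synthetic data are assumptions for the example.

```python
import numpy as np

def belsley_diagnostics(X):
    """Condition indices and variance-decomposition proportions (VDPs)."""
    X = np.asarray(X, dtype=float)
    Xs = X / np.linalg.norm(X, axis=0)           # scale each column to unit length
    _, s, Vt = np.linalg.svd(Xs, full_matrices=False)
    cond_idx = s.max() / s                       # one condition index per singular value
    phi = (Vt.T ** 2) / (s ** 2)                 # phi[k, j] = v_kj^2 / s_j^2
    vdp = (phi / phi.sum(axis=1, keepdims=True)).T
    return cond_idx, vdp                         # vdp rows: singular values; columns: predictors

# Synthetic illustration only: x3 is almost exactly x1 + x2.
rng = np.random.default_rng(1)
x1 = rng.uniform(0, 100, 31)
x2 = rng.uniform(0, 100, 31)
x3 = x1 + x2 + rng.normal(0, 0.1, 31)            # near-exact linear dependency
x4 = rng.uniform(0, 100, 31)
cond_idx, vdp = belsley_diagnostics(np.column_stack([x1, x2, x3, x4]))

# Predictors whose VDP exceeds 0.5 in the row with the largest condition index.
flagged = np.where(vdp[-1] > 0.5)[0]
print(cond_idx.round(1), flagged)
```

Because NumPy returns singular values in descending order, the last row of `vdp` corresponds to the largest condition index, and the predictors flagged there are the ones taking part in the near dependency.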

Data Analysis: Proposed Regression Model
The abovementioned 10 predictor variables, along with the corresponding per-capita income (PCI), labelled as "y", were tested for linear regression fit through the "stepwiselm" function of MATLAB®, with the p-value of the F-statistic less than or equal to 0.05 as the criterion. This function creates a linear regression model using stepwise regression to add or remove predictors, starting from a constant model. At each step, the function searches for terms to add to the model or remove from the model, based on the p-value. This resulted in the following form of the regression model:
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_5 x_5 + \beta_8 x_8 + \beta_9 x_9 + \beta_{10} x_{10} + \beta_{39} x_3 x_9 + \beta_{910} x_9 x_{10} \qquad (1)$$
where $\beta_0$ is the constant intercept and $\beta_i$, for i among 1, 2, ..., 10, are the coefficients or weights of the retained predictor variables $x_i$. Further, there are two interaction or interdependent terms, $x_3 x_9$ and $x_9 x_{10}$, along with their respective coefficients, $\beta_{39}$ and $\beta_{910}$.
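A model with seven main effects and the two interaction terms named above (degree with wood, and wood with electricity) can be fitted by ordinary least squares as sketched below. This is a minimal illustration on synthetic data, not a reproduction of the stepwise search performed by "stepwiselm": the helper names `design_matrix` and `fit_ols`, the "true" coefficients, and the noise level are all invented for the example.

```python
import numpy as np

def design_matrix(x1, x2, x3, x5, x8, x9, x10):
    """Columns: intercept, seven main effects, then x3*x9 and x9*x10."""
    ones = np.ones_like(x1)
    return np.column_stack([ones, x1, x2, x3, x5, x8, x9, x10, x3 * x9, x9 * x10])

def fit_ols(A, y):
    """Least-squares coefficients and the coefficient of determination R^2."""
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return beta, r2

# Synthetic illustration only: 31 "districts", predictors in percent.
rng = np.random.default_rng(2)
P = rng.uniform(0, 100, size=(31, 7))
A = design_matrix(*P.T)

# Hypothetical coefficients, chosen only to generate example data.
beta_true = np.array([100.0, 1.0, -0.5, 3.0, 0.8, 0.6, -1.2, 0.9, 0.02, -0.01])
y = A @ beta_true + rng.normal(0, 1.0, size=31)

beta, r2 = fit_ols(A, y)
print(beta.round(3), round(float(r2), 4))
```

The interaction columns are simply element-wise products of the two predictors, so the model stays linear in its coefficients and plain least squares applies.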
The model in Equation (1) indicates that primary, SSC, and degree in PP, pakka in HT, bath in HF, wood in CES, and electricity in LES might be the important predictors of per-capita income (PCI) in Pakistan. Table 1 provides pertinent statistics related to this model. ANOVA summary statistics are in Appendix A, Table A1. As is evident from Table 1, model (1) accounts for roughly 94% of the variability in PCI, with the p-value being extremely small, indicating a robust fit and rejection of the null hypothesis. Similarly, the individual regression coefficients in model (1) have F-statistics-based p-values much less than 0.05, indicating a reasonably strong fit for each predictor. Further, regression models are typically used to provide interpolated predictions of y (in this case, PCI) for scenarios in which the data for predictor parameters are available. The proposed model (1) is likely to provide reasonable PCI estimates if the model variables, xi, i = 1, 2, 3, 5, 8, 9, 10, are within the min/max limits in Table 1. All predictors except x9, wood, are individually correlated positively with the PCI, as shown in Table 2 below. Therefore, PCI is likely to increase with an increase in all predictors except x9, wood. The differences between the magnitudes and signs of the coefficients in the regression model and the individual/independent correlation coefficients could be explained as follows: The regression model attempts to minimize the sum of the squared errors between the regression-predicted PCI and the given PCI. The resulting weights (regression coefficients) and their respective signs (+/−) are assigned during this minimization process to fit a straight line, as shown in Figure 2. As an example, the x2 (SSC) variable in Table 1 has a regression coefficient of −18.2715, which, read in isolation, would indicate that PCI decreases by 18.2715 units for each one-unit increase in the SSC level of education of the underlying population.
This inference is obviously not correct, as the above correlation coefficient for x2 indicates. Each regression coefficient reflects the contribution of its predictor with the other predictors held fixed, so with correlated predictors its sign can differ from that of the pairwise correlation. The weights and signs of regression coefficients are, therefore, adjusted to provide a regression model that minimizes the squares of the error.
Figure 2. Model (1) and its fit to the data. This figure also provides 95% confidence bounds, indicating that (1) is an appropriate model to represent PCI in Pakistan.
The model in Equation (1) has two interaction terms, x3 and x9 (degree and wood) and x9 and x10 (wood and electricity). Both these interaction terms reinforce the validity of our hypothesis: there is a strong correlation between the type of energy available to and in use by the households and the income at the district level in Pakistan. Figure 3 shows the interaction between the percentages of degree holders and the users of wood. This suggests that PCI is depressed as the use of wood increases, as long as the percentage of degree holders is below 5%. On the other hand, PCI is higher or likely to increase if the degree percentage is above 5% and the use of wood increases. This scenario points to the fact that while increased use of wood indicates a lower household PCI, a share of more than 5% of society's members with a higher education (degree) can compensate and improve the earning potential even in gas-deprived districts. Notwithstanding the impact of higher education, the interaction terms' relationship also indicates that increased use of poor-quality, low-density cooking fuel has depressing effects on PCI until another factor mitigates its impact.

Interaction Terms in Model
As seen in Figure 4, there exists a strong negative correlation between x9 and x10 (wood and electricity) beyond certain percentages of wood and electricity. Figure 4 indicates that PCI increases as the use of electricity increases, provided the use of wood is less than 50%. By contrast, PCI becomes depressed when the use of wood is beyond 50%, even if the electric connectivity is higher. This interactive relationship again confirms the dominant impact of poor-quality, low-density, and labor-intensive cooking fuel on the earning abilities of a household, even when it is provided with a better-quality and convenient lighting source. This phenomenon leads to the inference that eradicating poverty requires improving the quality of energy for all of a household's needs.
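The threshold behaviour described for the two interaction terms can be read off Equation (1) by differentiating with respect to one predictor while holding the others fixed. The derivation below is a sketch under that reading. The source does not report the thresholds in this form, and linking the starred values to the 5% and 50% levels read from Figures 3 and 4 is an assumption.

```latex
% Marginal effects implied by the interaction terms of Equation (1)
\begin{align}
\frac{\partial y}{\partial x_9}    &= \beta_9 + \beta_{39}\,x_3 + \beta_{910}\,x_{10}, \\
\frac{\partial y}{\partial x_{10}} &= \beta_{10} + \beta_{910}\,x_9 .
\end{align}
% Each marginal effect changes sign where the right-hand side vanishes:
\begin{align}
x_3^{*} &= -\frac{\beta_9 + \beta_{910}\,x_{10}}{\beta_{39}}
          \quad \text{(degree share at which the effect of wood changes sign, for a given } x_{10}\text{)}, \\
x_9^{*} &= -\frac{\beta_{10}}{\beta_{910}}
          \quad \text{(wood share at which the effect of electricity changes sign)}.
\end{align}
```

Under this reading, the effect of electricity on PCI depends only on the wood share, whereas the effect of wood depends on both the degree share and the electricity share, so a single degree threshold holds only at a fixed level of electricity access.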

Residual Analysis of the Model (1)
Residuals, (y −ŷ), are helpful in detecting outlying PCI values and checking error term assumptions in the regression models. Three of the four assumptions (collinearity, normality, autocorrelation, and homoscedasticity) mentioned in Section 2.2 are analyzed in this section. Collinearity is already discussed under Section 2.2.
Normality: Shown in Figures 5 and 6 are two plots pertaining to the residuals of model (1) [33,34]. Figure 5 is Cook's distance vs. rows of observations, i.e., districts. Cook's distance is useful for identifying outliers in the data [33,34]. An observation with a Cook's distance larger than three times the mean Cook's distance is possibly an outlier. Figure 5 indicates that four districts fall slightly outside the range established by the Cook's distance. These four districts are, however, not considered outliers when viewed in conjunction with the normal distribution plot in Figure 6. Figure 6 indicates that residuals are normally distributed, as assumed in the model; therefore, these four districts can be retained without violating normality assumptions.
Autocorrelation: Figure 7 shows the sample autocorrelation function or correlogram of raw residuals resulting from the difference between the fitted PCI and the observed PCI. The sample autocorrelations fall within the 95% confidence bounds and are, thus, without significant serial correlation. This is further tested using Durbin-Watson (DW) [35] and Ljung-Box Q (LBQ) [36] tests for residual autocorrelation in MATLAB. Both tests assume no serial correlation as the null hypothesis and return statistics upholding or rejecting this. DW statistics range from 0 to 4, with values between 1.5 and 2.5 indicating no significant serial correlation. For the data under consideration, the DW statistic was 2.2 with a p-value of 0.78, indicating that no significant autocorrelation exists among the residuals. The LBQ test has the additional flexibility of testing at various lags. Since there exists some mild serial correlation at lags 2, 4, 5, 9, and 19, the LBQ test is used to examine these lags. The LBQ test returns either 0 (not rejecting the null hypothesis) or 1 (rejecting the null hypothesis) with a respective p-value at each lag. The statistics for all lags were 0 with p-values ranging from 0.11 to 0.35, indicating that no serial autocorrelation exists among the data under consideration.
Homoscedasticity: Homoscedasticity refers to all residuals having the same variance. We used MATLAB's residual diagnostic function to test this. Shown in Figure 8 are residuals vs. fitted PCI. Although an obvious trend is not visible in Figure 8, we further tested the data to rule out possible heteroscedasticity. To this end, Breusch-Pagan (BP) [37] and Engle's autoregressive conditional heteroscedastic (ARCH) [38] tests were used. Both tests assume a null hypothesis of no heteroscedasticity and return statistics upholding or rejecting this. The returned p-values of 0.6829 and 0.1023 for BP and ARCH, respectively, uphold the null hypothesis, so there is no significant heteroscedasticity in the residual data.
Comparing residuals resulting from PLS and OLS: We compared residuals from PLS with those of OLS in Figure 9. The mean and standard deviation of the residuals are 0.00 and 17.75 for OLS and 0.00 and 29.93 for PLS. Although we did not analyze PLS extensively, the residuals indicate comparable performance for both methods, with OLS showing the smaller spread.
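Two of the residual diagnostics used above have simple closed forms that can be sketched directly. The example below computes the Durbin-Watson statistic and Cook's distance with NumPy. It is an illustration of the standard formulas rather than of MATLAB's implementation, and the function names and the small data set with one planted outlier are assumptions for the example.

```python
import numpy as np

def durbin_watson(resid):
    """DW = sum of squared successive differences / sum of squared residuals."""
    resid = np.asarray(resid, dtype=float)
    return np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)

def cooks_distance(A, y):
    """Cook's distance for each row of an OLS fit with design matrix A."""
    n, p = A.shape
    Q, _ = np.linalg.qr(A)                  # thin QR factorization
    h = np.sum(Q ** 2, axis=1)              # leverages: diagonal of the hat matrix
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    e = y - A @ beta                        # raw residuals
    s2 = (e @ e) / (n - p)                  # residual variance estimate
    return e ** 2 / (p * s2) * h / (1.0 - h) ** 2

# Perfectly alternating residuals give DW = 3 (negative serial correlation).
dw = durbin_watson([1.0, -1.0, 1.0, -1.0])

# Ten points on the line y = 2x + 1 plus one planted high-leverage outlier.
x = np.append(np.arange(10.0), 30.0)
y = np.append(2.0 * np.arange(10.0) + 1.0, 0.0)
A = np.column_stack([np.ones_like(x), x])
D = cooks_distance(A, y)

# Rule of thumb from the text: flag rows above three times the mean distance.
flagged = np.where(D > 3.0 * D.mean())[0]
print(dw, flagged)
```

A DW value near 2 corresponds to uncorrelated residuals, values towards 0 to positive serial correlation, and values towards 4 to negative serial correlation, which is why the alternating sequence scores 3.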

Comparison of the Proposed Model (1) with Education-Only Model
Several researchers have established the overarching impact of education on the individuals' earning abilities and the households' income [24,25,28,29]. In this section, a comparison is provided for the education-only model with that of the proposed model (1). For Census 2017 data, the education-only model accounts for 78% variability in PCI, as shown in Table 3. The mean regression plot pertaining to this model is shown in Figure 11. Comparing Figures 11 and 12 and Tables 1 and 3, model (1) captures 94% variability in PCI as compared to the education-only model with 78% coverage. Therefore, the proposed model (1) incorporating energy-related variables is more inclusive and better for estimating PCI for Pakistan. The proposed model augments the education-only model and, thus, establishes the relevance of household energy towards affluence/poverty in Pakistan. This model may be applicable in other developing economies too.
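The comparison between an education-only model and a model augmented with energy variables is a comparison of nested regressions, which can be sketched as follows. The helper names, the synthetic data, and the coefficients used to generate it are assumptions for illustration. The adjusted R² and the partial F-statistic are standard additions that the source does not report for this comparison.

```python
import numpy as np

def sse(A, y):
    """Sum of squared residuals of the least-squares fit of y on A."""
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    r = y - A @ beta
    return float(r @ r)

def compare_nested(A_small, A_full, y):
    """R^2, adjusted R^2, and partial F-statistic for two nested OLS models."""
    n = len(y)
    sst = float(np.sum((y - y.mean()) ** 2))
    sse_s, sse_f = sse(A_small, y), sse(A_full, y)
    p_s, p_f = A_small.shape[1], A_full.shape[1]   # column counts include the intercept
    r2_s, r2_f = 1.0 - sse_s / sst, 1.0 - sse_f / sst
    adj_s = 1.0 - (1.0 - r2_s) * (n - 1) / (n - p_s)
    adj_f = 1.0 - (1.0 - r2_f) * (n - 1) / (n - p_f)
    f_stat = ((sse_s - sse_f) / (p_f - p_s)) / (sse_f / (n - p_f))
    return {"r2_edu": r2_s, "r2_full": r2_f,
            "adj_r2_edu": adj_s, "adj_r2_full": adj_f, "f_stat": f_stat}

# Synthetic illustration only: 31 "districts", invented coefficients.
rng = np.random.default_rng(3)
edu = rng.uniform(0, 60, size=(31, 3))       # primary, SSC, degree (percent)
energy = rng.uniform(0, 100, size=(31, 2))   # wood, electricity (percent)
y = (100 + edu @ np.array([2.0, 1.0, 4.0])
     + energy @ np.array([-1.0, 0.8]) + rng.normal(0, 10, 31))

ones = np.ones((31, 1))
out = compare_nested(np.hstack([ones, edu]), np.hstack([ones, edu, energy]), y)
print({k: round(v, 3) for k, v in out.items()})
```

Because adding predictors can never lower R² in a least-squares fit, the adjusted R² and the partial F-statistic give a fairer view of whether the extra energy variables earn their place in the model.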

Findings
In light of Sections 2 and 3, we summarize the findings as follows:
• The data used in this research, though collected over an extended time and by numerous individuals, are reliable, reflecting the real-life on-ground situation in Pakistan. Although this study has used a limited subset of these data, the analyses are likely to be applicable across the entirety of Pakistan except for a few highly developed urban centers or extremely remote rural areas.
• Education has an important linkage with the earning ability (PCI) of people in Pakistan, as is the case worldwide. The affluent population tends to aspire to greater schooling, high school, and college, and higher education enables and offers better earning opportunities.
• Housing types and household facilities too are dependent on household income and are good predictors of PCI.
• The critical energy-poverty nexus established through this work provides quantitative correlational evidence between energy and PCI at the district level in Pakistan. This correlation leads to the proposed model in Section 3, accounting for almost 94% of the variability in PCI.

Discussion
Notwithstanding the limitations mentioned in Section 6, the robust correlation between energy and one of the key indicators of social welfare, the per-capita income (PCI), opens avenues for exploring some important dimensions of this relationship. Does this correlation fit into the existing predictors and indicators of wellness and poverty? Did energy find the right place in poverty alleviation programs in Pakistan? Should the energy type be included as an indicator in evaluating and reporting wellness/poverty/HDI and be incorporated in the development programs?
Historically, there exists a strong linkage of primary energy sources and the energy conversion industry with the state of affluence/poverty [3,5,11,20]. Based on the importance of energy in human development, energy found a central place in the seventeen SDGs adopted by the UN in 2015 [39–41]. Energy access and poverty interdependence is well documented, and the importance of adequate energy had been identified earlier too: in the production of goods and supplies, in comfortable housing, for the provision of essential services such as health support and education, and even for the consumption of food [2,7,26,42–52].
During the last four decades, a few researchers concluded that energy-poor nations would experience a steep rise in human development relative to energy consumption [2,4]. In our study, we see a reasonably strong negative correlation of firewood with average PCI at the district level (Figure 12), pointing to the fact that increased use of poor-quality/inconvenient, low-density fuel contributes towards reduced income. Conversely, electric connectivity-PCI statistics indicate a positive correlation (Figure 13), meaning that increased availability of clean and convenient energy would lead to better income. These findings closely mirror the energy-development relationship established through the earlier studies, indicating the existence of the energy-poverty (affluence) nexus even at the district level in a developing country, specifically Pakistan. Thus, these graphs further validate the model arrived at through this research. The internationally accepted indicators of human development and state of wellness, such as the human development index (HDI) and multi-dimensional poverty indicator (MPI), are mostly in use for measuring affluence and deprivation. Among these, HDI does not include energy as a component of human development [5], and household energy merely appears as a small fraction in MPI calculations [13,14]. The key statistic on the energy-poverty nexus brought forward through this work has shown a clear linkage of energy in measuring socioeconomic wellness in terms of PCI. Further, these statistical analyses corroborate some of the established and already in-use socioeconomic relationships such as the quality of dwelling and poverty, the households' facilities and economic wellbeing, and the education levels and economic growth of the household [2,11,26,28,29,48,49].
Lack of adequate energy access has multi-faceted impacts on the social welfare of affected families in the form of poor health conditions, drained productive time, decreased chances of value addition through quality learning, diminished productivity, and limited wherewithal for income generation, thereby retarding overall development [17,45,51,53]. Similarly, continuous burning of wood for cooking has serious implications for the user, the society, and the world and is not at all sustainable [44,45,53]. In developing countries, mostly women and children are constrained to collect and bring the biofuels such as wood, straw, and animal dung in the energy-constrained households [43–45,49,52]. Such activities take the precious time of women and children, respectively, away from income-generating activities and schooling [43–45,54,55]. Lack of access to adequate energy resources is also partly responsible for child-labor practices in Pakistan [54]. Additionally, the lack of clean energy sources is also the cause of the ill effects of indoor pollution, leading to 1.6 million yearly premature deaths, respiratory illnesses, eye diseases, and low-weight births [56,57]. Further, our planet, and particularly Pakistan, can ill afford the deforestation caused by firewood with serious environmental degradation and global warming impacts [23,43,44].
Millennium Development Goals (MDGs) adopted in the year 2000 were succeeded by Sustainable Development Goals (SDGs), which espoused seventeen goals and established interconnectivity between them [58]. Our statistical results and the ensuing discussion showed how most SDGs are directly correlated with SDG7, "Affordable and Clean Energy". The attainment of SDG1, poverty eradication, is not possible unless everyone has access to "affordable" clean energy. The ill effects of unclean cooking fuels are a denial of SDG3, human health. The obligation to run for pollution-heavy biomass takes children away from the critical SDG4, quality education. The data presented above and related analysis indicate a strong electricity-education correlation of 72% (Table 4). Provision of clean water (SDG6) too has 48% dependence on electricity as per these data. The employment opportunities (SDG8) are also strongly correlated with electric connectivity (63%). Conversely, the unclean and inconvenient firewood that has adverse environmental impacts too (atmospheric degradation, deforestation, and increased carbon footprint) has a negative correlation with all these wellness parameters, highlighting the significance of affordable clean energy for other SDGs. Other SDGs were not covered during the Population Census 2017; however, SDG11 (sustainable communities), SDG13 (environmental protection), and SDG14/15 (life on earth) are directly linked with the primary energy source types. Similarly, industrial/infrastructure development (SDG9) cannot be imagined without reliable and sufficient energy as its lifeblood. It is feared that the SDGs will be far from achieved in 2030 if the developing economies do not realize and incorporate the provision of sustainable and affordable clean energy in their development plans. The developed world and a few developing countries embraced the technological advancements and adopted policies for self-reliance in energy and for socioeconomic progress.
In Pakistan, instead, the share of imported energy increased, as reported by the International Energy Agency, and it is no surprise that as of 2017, Pakistan's HDI was the lowest in South Asia, after Yemen, Afghanistan, and Syria: three war-ravaged countries [52]. In per-capita energy consumption, Pakistan stood at 140th as per the latest available World Bank report [41]. On the productive use of energy, the IEA reports that Pakistan used 0.43 toe per thousand USD of GDP (2005 prices), as against Bangladesh, where 0.23 toe was consumed for the same output [59]. This elucidates the need for planning beyond energy supply and the importance of the productive use of energy in driving the economies towards poverty eradication.
Energy did appear as one of the nine pillars of development and poverty reduction in Pakistan's agenda too during 2012, but no resources were allocated to it in the poverty-reduction plan [60]. On the energy side, the focus has been on increasing energy supply and power generation, similar to most developing countries [61]. Additionally, poverty alleviation programs never incorporated energy as a poverty-reduction goal either. In the 3rd quarter of 2019, Pakistan's electric power generation surpassed the demand by many thousand megawatts, and the government was looking for "plans to utilize the surplus energy" [61]. Conversely, repeated increases in electricity tariffs are poised to reduce electricity demand further [62]. Unconsumed surplus power generation, continuously rising circular debt, decreasing electricity demand owing to price escalation, heavy financial drain in subsidies, and heavily import-dependent power plants highlight the need for a comprehensive outlook towards energy planning, its management, and governance in Pakistan. Not incorporating energy in poverty alleviation programs is likely to remain a major contributor to unachieved poverty alleviation and socioeconomic development goals.
Our findings validate the fact that households' fuel choices are dependent upon the socioeconomic conditions of the population [52,63,64]. This study does not assume a causal relationship between energy (and other explanatory variables) and the PCI except for the fact that life needs are met with physical and other resources, including energy. In line with the experts' opinion, there is also strong statistical evidence of the fact that the districts with more households using unclean and inconvenient source(s) of energy have much lower per-capita income [1,50,53]. Another critical dimension of energy usage is the outcome: whether it adds to the financial burden or is a resource for improving economic wellbeing [18,50,53]. The energy-income correlation shows a two-way interaction between energy and socioeconomic wellness, i.e., the energy is the means as well as the end [18,53]. Sustainable programs for poverty eradication and socioeconomic development through clean and affordable energy can be constituted based on relevant data only. The input data must cover all aspects of two-way interaction between energy and poverty so as to enable policy formulation and establishment of sustainable programs for poverty alleviation and development [63,64].

Limitations
Data-driven inferences depend critically on the processes of data collection and, subsequently, on the quality of the collected data. Additionally, no inference is valid unless it is interpreted through the local sociocultural environment. Based on the knowledge gained through the literature review and the sociocultural dynamics of Pakistan, certain limitations pertaining to the data used in this study are highlighted here. The Census 2017 was a population survey and not an energy-related survey, although it captured the predominant energy source used in each household. The Census 2017 thus provides, among other variables, information on the number of households using different types of energy for cooking and for lighting. However, information on the use of electricity for lighting is based on connectivity and does not cover the reliability, quality, and affordability aspects of electrical energy. This matters because, according to the IEA's/IRENA's "The Energy Progress Report 2019" [41], Pakistan has the 4th largest unserved population, with another 144 million confronting reliability problems owing to frequent power outages. Data on the amount of energy consumed and the proportion of household income spent on energy are also not available. Households in Pakistan, similar to those in other developing countries, are constrained to use multiple sources of energy instead of relying on one, owing to the inaccessibility of clean/high-density energy or the unreliability of the primary energy source [23,43,57]. Although the phenomenon of mixed fuel use remains unevaluated, the data used in this study cover the entire population of Pakistan, which may not be possible when conducting a customized survey. Thus, despite a few limitations, the outcome of this study opens new avenues for taking a fresh look at energy and its impacts on poverty in Pakistan.

Conclusions
This study is a preliminary attempt to examine quantitatively the energy-poverty interplay in Pakistan, with a possible extension to other developing countries. The study is unique in that it uses detailed real-world data for Pakistan that have not been used in any study thus far. The proposed regression model highlights the energy-PCI statistical correlation and energy's impact on aggregated household PCI. Since the study uses the final energy (wood and electricity) available to households as two of the explanatory variables, it may be compared and contrasted with the total energy input to the society in a country, a quantity used in earlier studies for exploring the linkage between energy and wellness. In addition to the education levels (a typical metric linked to poverty) and other economic wellness indicators, the proposed model highlights a close connection between energy and poverty, thereby augmenting the education-only model and suggesting energy as one of the reliable factors for predicting PCI. The model provides quantitative evidence on how the lack or availability of clean energy sources affects earning abilities and the income aggregated at the district level in Pakistan.
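The comparison between an education-only model and an energy-augmented model can be illustrated with a minimal sketch. The example below is not the authors' model and does not use the Census 2017 data: the district count, variable names (`literacy`, `wood_cooking`, `electric_lighting`), coefficients, and noise levels are all hypothetical, and the data are synthetic. It only shows how the in-sample coefficient of determination of two nested ordinary least-squares fits might be compared.

```python
import numpy as np

def r_squared(X, y):
    """In-sample R^2 of an ordinary least-squares fit with an intercept."""
    X1 = np.column_stack([np.ones(len(y)), X])      # prepend intercept column
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)   # least-squares coefficients
    resid = y - X1 @ beta
    ss_res = float(resid @ resid)                   # residual sum of squares
    ss_tot = float(((y - y.mean()) ** 2).sum())     # total sum of squares
    return 1.0 - ss_res / ss_tot

# Synthetic, hypothetical district-level indicators (NOT census data).
rng = np.random.default_rng(0)
n_districts = 120  # assumed number of districts, for illustration only

literacy = rng.uniform(0.2, 0.9, n_districts)
wood_cooking = np.clip(1.0 - literacy + rng.normal(0, 0.15, n_districts), 0, 1)
electric_lighting = np.clip(0.4 + 0.5 * literacy + rng.normal(0, 0.15, n_districts), 0, 1)

# Assumed data-generating process for per-capita income (arbitrary units).
pci = (50 + 80 * literacy - 40 * wood_cooking + 30 * electric_lighting
       + rng.normal(0, 5, n_districts))

# Education-only model vs. model augmented with energy indicators.
r2_edu = r_squared(literacy.reshape(-1, 1), pci)
r2_full = r_squared(np.column_stack([literacy, wood_cooking, electric_lighting]), pci)

print(f"R^2, education only:     {r2_edu:.3f}")
print(f"R^2, education + energy: {r2_full:.3f}")
```

Because the two models are nested, the augmented fit can never have a lower in-sample R² than the education-only fit, so the size of the gain (or an adjusted R² or out-of-sample check) is what carries information. The values printed here reflect only the synthetic assumptions above and say nothing about the study's reported results.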
The critical energy-poverty nexus established through this work should help in better understanding the sustainability requirements and provide suitable guidelines for data collection and dissemination, as the availability of reliable data is critical in data-driven planning and implementation. This preliminary work may provide impetus to (1) customized/focused data collection to fully explore the energy-poverty nexus in Pakistan; (2) creating synergy in energy planning and poverty alleviation programs, i.e., poverty mitigation through the adoption of clean and convenient renewable energy options; (3) drawing researchers' and policy makers' attention to considering energy as an important contributory factor in human development and incorporating it as a parameter in calculating HDI; and (4) invoking researchers' interest in further investigation into the energy-poverty interplay in Pakistan and in further studies comparing socioeconomic wellbeing between communities with access to clean energy and those utilizing low-density unclean energy sources.

Institutional Review Board Statement: The aggregated data for this research came from the Census 2017 campaign, Pakistan Bureau of Statistics. Although the data are not yet fully published, they are considered to be in the public domain. No individual human subjects are identified in the census data. Therefore, IRB approval is not applicable.

Informed Consent Statement: Not applicable.
Data Availability Statement: Data for conducting this research have been obtained from the Pakistan Bureau of Statistics. A summary of the compiled data is available at https://www.pbs.gov.pk/content/population-census (accessed on 10 April 2021). Detailed data have been released on special request, with restricted use instructions, and are held with the first/corresponding author.

Conflicts of Interest: The authors declare no conflict of interest.