1. Introduction
Ensuring high standards of living for today’s society is the overarching goal of sustainable development policies at the national (central administration), regional and local (LGU) management levels. This is because sustainable development, as a concept that meets the current needs of humanity, is an economic doctrine committed to aligning the population’s standards of living with the current level of human development. The main focus of sustainable development measures at the regional and local levels is on the residents; this is why improvements in their standards of living can tell us much about the region’s sustainable development. On the other hand, having enough funds is a decisive factor in whether or not local government units are capable of fulfilling the numerous tasks entrusted to them by the central administration (it has an impact on such aspects as the quality of public services and the quantitative and qualitative conditions of the social and economic infrastructure). When in a more advantageous financial situation, local government units are better positioned to implement investments that capitalize on favorable economic, social and environmental developments, which, as a consequence, can translate into higher standards of living for the local population. However, it would be difficult not to agree with P.S. Morrison, who notes that while human wellbeing depends on local conditions, individuals differ in how they perceive the impacts of changes driven by local government investments [
1]. Indeed, local authorities are responsible for many aspects of sustainable development (including its social and economic dimensions) and for making good use of the region’s potential in order to address the needs of local residents as far as possible. After 1989, as a result of the political transformation, Polish local government units were granted financial resources and other assets. The decentralization of public functions has forced, and continues to force, local government units (especially communes) to put in place a financial policy which ensures these functions are performed to the greatest possible extent and at the highest possible level. According to the European Charter of Local Self-Government (Article 9), “local authorities shall be entitled, within national economic policy, to adequate financial resources of their own, of which they may dispose freely within the framework of their powers” [
2]. Hence, the availability of financial resources is fundamental to the functioning of local government units and determines whether their statutory tasks can be performed. Unfortunately, Polish local government units (especially communes) are increasingly obliged to rely on external funding sources. This is explained by the need not only to finance a number of investments designed to enhance the standards of living for the population but also to cover running costs. B. Dafflon and K. Beer-Toth point out that local government debt drives the modernization of the local economy and the creation of jobs. The authors suggest that debt clearly makes it easier for local government units to fulfill their tasks while also providing alternative financing streams if sufficient funds are not available [
3]. X. Jing notes that over the last 40 years of development since the reform, Chinese local government units have accessed funds by issuing public debt and made a positive contribution to the implementation of social infrastructure projects and to the joint development of urban and rural areas [
4]. Local government debt can have different consequences, ranging from a leverage effect, through the repayment of previous liabilities, to bankruptcy. Poland saw the last of these scenarios for the first time on 1 January 2019, when, due to its debt, the Ostrowice commune (located in the Zachodniopomorskie voivodeship) was dissolved and incorporated into the administrative territories of two neighboring communes (Drawsko Pomorskie and Złocieniec).
Despite numerous analyses, the quantification of living standards as well as the identification of aspects that contribute to improvements in this area (which are significantly determined by the extent to which material, immaterial, individual and collective needs are met) are problems yet to be fully resolved. The difficulty lies in selecting the diagnostic variables, relevant measuring methods and ways of identifying relationships between the categories considered.
Obviously, some literature exists that addresses the issue of LGU financing and the population’s standards (or quality) of living (including N. Hlepas [
5], B. Oleszko-Kurzyna [
6], A. Cárcaba et al. [
7]) or local development (including H. Pondel [
8], M. Stanny and W. Strzelczyk [
9]). However, in the context of the multifaceted nature of these categories, there is a scarcity of papers that rely on appropriate multidimensional econometric methods.
The purpose of this paper is to identify the multidimensional relationships between the population’s standards of living and the financial situation of Polish communes. Another scientific goal for the author is to promote the use of canonical analysis, which is relatively rarely used in economic sciences. However, instead of providing a complete explanation of this calculation method, the focus was placed on its usefulness. The empirical analyses were based on data acquired by the author from a survey of 241 presidents and vice-presidents of commune councils in Poland (the project was financed with resources of the National Science Centre (NCN-Poland), allocated under project DEC-2021/05/X/HS4/00137). Keeping in mind that the standards of living are determined by many individual and exogenous factors, the author focused solely on analyzing how they are impacted by the financial situation of local government units. Both the financial situation of local government units and the population’s standards of living are multifaceted, which is the reason for using canonical analysis, a multidimensional statistical technique designed to relate two sets of variables.
2. Theoretical Aspects of the Role the Financial Situation of Local Government Units Plays in Creating the Standards of Living for the Population
According to A. Zeliaś [
10], increased interest from researchers in the standards of living, viewed as an economic category, can be explained by the transition from being fascinated by technical and economic advancements to reflecting on the benefits and threats brought about by human progress. The adverse phenomena involved in economic growth include [
11,
12]: accelerated environmental degradation threatening human and animal life; considerable increase in the incidence of and premature mortality from some diseases, mostly those referred to as diseases of affluence (including cardiovascular diseases and cancer); increase in social pathologies (including frustration, crime, alcoholism); rapid increase in the number of traffic accidents and accidents at work; the confusion of value systems; the dismantling of old systems which are not replaced by the establishment of new ones; the widening social gaps in different dimensions, etc.; excessive consumption of products and services leading to environmental pollution; and the inability to create substitutes for non-renewable resources without increased risks for humans and the environment.
Despite the availability of numerous books and papers, the literature on the subject still fails to provide a single, commonly accepted definition of “standards of living”. A UN expert committee defined them in 1954 as “the overall actual living conditions of humans and the degree to which their physical and cultural needs are met through a flow of goods and services, whether paid or derived from social funds” (after: [
13]). This became the starting point for defining and analyzing “standards of living”. The definitions formulated by successive authors included both narrow ones which focused on measurable phenomena and broad ones which took into account life aspects that are difficult to gauge. It seems that the definitions most frequently referred to in the Polish literature are those proposed by A. Luszniewicz [
14], who views standards of living as the degree of meeting (securing) households’ cultural and material needs with flows of goods and services they pay for and with collective consumption flows; and by Bywalec and Rudnicki [
15], who consider standards of living to be the degree to which needs are addressed by the consumption of human-made material and immaterial goods. Hansen and Grubb view living standards as the happiness or utility that can be derived from consumption. In that context, consumption can just be generally defined as any activity, status or good which an individual can acquire. According to B. Chan Yin Fah, living standards are related to the consumption or use of economic goods, and he views them as the sum of food, fuels and other perishable goods purchased, domestic work, vehicles, clothes, other goods of different durability and human-made services which are used by an individual or a group over a defined period [
16]. A broad approach to standards of living was proposed by S. Kalinowski, who defines them as the “system of synthetic indicators resulting from the level of wealth manifested in how the physical and intangible needs are met and, as a consequence, in the economic capability, commitments and aspirations of individuals”. Viewed from that angle, the level of wealth is related to the amount and sustainability of revenue, being able to buy goods in accordance with one’s needs and being able to repay one’s debt. In turn, when quantifying this phenomenon, the author recommends an approach which consists in combining measurable, immeasurable, objective, subjective, quantitative and qualitative characteristics [
17]. According to O.G. Okafor [
18], standards of living are determined by a variety of factors, such as revenue, availability of jobs, class differences, poverty indicators, quality and availability of housing, number of working hours converted into essential necessities, Gross Domestic Product, inflation rate, access to healthcare services, quality and availability of education, life expectancy, disease incidence, cost of goods and services, infrastructure, organic economic growth, economic and political stability, political and religious freedom, environmental quality and security level.
“Standards of living” is a term largely based on the theory of needs. Essentially, social statistics uses four categories (some authors restrict their considerations to three terms: quality of life, living standards, dignity of life; for more information, see [
12]) which are widely recognized by researchers and refer to the degree to which the needs are addressed:
Living conditions: the objective infrastructural conditions the society lives in. They are mostly related to the material situation, and to securing the existence and environment for individuals.
Living standards: the extent to which cultural and material needs are met by the existing infrastructure that enables them to be addressed.
Quality of life: all aspects of an individual’s life that are related to his/her existence, experiencing different emotions and being someone.
Dignity of life: not experiencing deprivation which could negatively result in the population spending their lives in changing economic realities. This includes the financial situation of households and the immaterial aspects of living.
The definition by A. Luszniewicz, as referred to earlier in this paper, was used in the empirical studies carried out below. The author believes it best represents the essence of this economic category by emphasizing the importance of the degree to which human needs are met.
Obviously, some human needs (which keep growing and evolving in line with cultural and economic transformation) can only be addressed on an individual basis. However, certain needs are met with public resources (e.g., the need for security and order in the surroundings, social assistance), through the direct or indirect activities of central or local government authorities. Indeed, the environment where the local population lives is a local-level social system where the residents spend nearly all of their time. Because certain needs of local dwellers are addressed with the use of public systems, LGUs and related authorities are effectively compelled to focus on the allocation of financial resources and on collecting enough funds to cover these expenses. Hence, the financial situation at the local government level plays a special role in this context.
All around the world, local government units play a key role in enabling development and improving the standards of living for the population. Increasingly often, robust management mechanisms are put in place, a civic society develops even in places where it was historically weak, and local government units are reinforced to act in more and more open and responsive ways [
19]. Compared to other countries, Polish local government units are a powerful actor in the national political system, which is reflected by a large proportion of GDP being redistributed through local government budgets (reaching 14.0% in 2019). Of all European Union member states, only three Nordic countries reported a higher share of local financial resources in GDP. The composition of public expenditure in Poland turns out to be even more decentralized than in other countries. Indeed, the share of the local government finance sub-sector in central and local government expenditure was 33.9% in 2019. Higher ratios were only found in Denmark, Sweden and Finland [
20]. In the late twentieth century, most European countries (including Poland) witnessed a strong decentralization trend which translated into authority being transferred from the central government level to lower (local) levels. In Poland, local government was restored in 1990, but only at the commune level. However, this provided grounds for the development of local autonomy in Poland. In 1999, the Polish local government shifted to a three-level structure; currently, the local level is represented by 2477 autonomous communes and 380 districts (the equivalent of the German and Austrian
Kreis and the Czech and Slovakian
okres), whereas the regional (NUTS-2) level comprises 16 voivodeships. Each of the above units has a scope of statutory tasks and owns financial resources allocated to support their performance. However, the legislator defined different revenue streams for each local government level. In this context, communes are in the most advantageous position since their budgets are fed with a large portion of resources, and their authorities can co-decide the amount and allocation of available funds. However, the relation between financial resources and tasks entrusted to local government units (as a consequence of the decentralization of tasks previously owned by the central government) gives rise to endless controversies among local authorities. Successive reforms gradually extended the scope of statutory tasks of local government units but often failed to ensure a consistent increase in the funds allocated to them. The financial situation of LGUs limits their capacity to implement investments that drive development without restricting the fulfillment of their current tasks. In Poland, the Local Government Act for Commune Level of 8 March 1990 sets forth several obligations of the local administration. By initiating, coordinating and performing these obligations, the LGUs are supposed to address the collective needs of their residents. In accordance with Article 7 of the above Act, “addressing the collective needs of a community is among the commune’s own tasks” [
21]. This includes, without limitation, issues related to healthcare, culture, public education and orderly development. Supra-communal tasks laid down in dedicated acts are the responsibility of district authorities [
22]. In the context of the above considerations, crucial tasks at the district level include stimulating the local labor market and fighting unemployment.
Certainly, there are other tasks of commune-level LGUs which directly or indirectly contribute to addressing the residents’ collective needs. While they are not set out in dedicated regulations, they are implied by the very essence of self-government, and include: developing local-level training and education systems; offering subsidies, incentives and discounts; creating a “space for working” (which includes support for the establishment of business incubators); creating various facilities and amenities; and identifying local needs.
Indisputably, the quality of public services provided by local government units as part of the tasks entrusted to them, and the quantitative and qualitative conditions of essential socioeconomic infrastructure used in addressing the whole range of the local population’s needs (and contributing to socioeconomic development) depend on the local government’s financial situation. The above boils down to whether local government units can be provided with financial security which has an effect on their capacity to fulfill their tasks. Their financial condition is affected by the amount and structure of revenues, by their ability to make proper use of repayable resources and by how efficient they are in accessing extra-budgetary funds. In that context, the literature on the subject uses some strictly interrelated terms, such as “financial soundness” (e.g., E. Padovani, E. Scorsone [
23], V. Pina et al. [
24]), “fiscal soundness” (e.g., C. S. Maher, S. C. Deller [
25], T.M. Mamun, S. Chowdhury [
26]) or “financial situation” (e.g., R. C. Casal et al. [
27], D. Prior et al. [
28]). However, the author will not use them interchangeably.
The literature on the subject views the financial situation of local government units as the capacity to finance services with collected revenues in given socioeconomic and institutional conditions, or as the local government’s ability to generate enough financial resources necessary to meet their obligations in a given period [
29]. A similar definition is provided by I.T. Ritonga et al. [
30], who see it as the “local government’s capacity to timely meet their financial obligations and to maintain the level of services delivered to the community”. According to L. Osowska and A. Ziemińska, the financial situation of local government units (or communes, more specifically) means their financial status in a defined time interval. The financial situation is reflected by a number of aspects, including the ability to perform their tasks, have a balanced budget and increase their assets. The financial situation of communes is certainly a complex phenomenon and includes the level of revenues, financial autonomy, investment volumes, the capacity to access extra-budgetary resources and the financial results [
31]. In turn, M. Stanny and W. Strzelczyk place the financial situation of local government units in the context of financial security. They define it as the ability to meet the unit’s financial obligations while ensuring continuous delivery of local government services, which has a direct effect on the performance of statutory tasks and on improvements to the residents’ living standards. The continuity is the consequence of a balanced budget, financial autonomy, independence from capital transfers, financial liquidity, long-term solvency and prudent expenditure of public funds [
9]. According to M. Jastrzębska, the financial situation can be defined as [
32]:
the ability to deliver services at a level no worse than the existing one;
the ability to access repayable and non-repayable financial resources to fulfill future tasks;
the ability to prevent entrepreneurs from moving to other LGUs;
the ability to address the challenges of the future economic situation.
The literature on the subject pays little attention to the local government units’ financial situation as a determinant of standards (or quality) of living for residents. Examples include research by N. Hlepas [
5], who analyzed the relationship between the local community’s social cohesion and their satisfaction with local and central government institutions, as well as papers by M. Malinowski and J. Smoluk-Sikorska [
33], who (based on spatial regression models) examined the spatial relationships between the financial situation of Polish LGUs and the population’s standards of living at the district level. In turn, based on the data collected in 76 Spanish towns in 2008–2010, B. Cuadrado-Ballesteros et al. [
34] analyzed the relationship between “financial soundness” and quality of life, and demonstrated that residents of financially sound communes enjoy a greater quality of life than others. The relevant literature seems to be dominated by papers that analyze the differences in the standards of living between aggregation levels. Examples include the works and findings by: S. Kalinowski [
17], who analyzed such aspects as the differences in poverty levels of Polish rural residents with precarious incomes; A. Zeliaś [
10], who presented a comprehensive typology and linear ordering of EU countries by standards of living; and L. Zhou et al. [
35], who used the TOPSIS method to linearly order the main cities of the Guizhou province by the standard of living of the local population. A. Carmeli identified the empirical relationships between fiscal conditions of Israeli local government units in 1997 and 1998 and the levels of education and employment in 2001. It follows from his study that preexisting fiscal conditions of LGUs had a significant impact on the future development of education and employment, which ultimately affect the population’s standards of living [
36]. As part of their study, Y. Cho and K-Y. Lee identified the relationship between satisfaction with public services, trust in local government, and social satisfaction. They carried out a survey with 980 residents of Jeonbuk, Korea, and used structural equation models (SEM) to demonstrate that satisfaction with public services has a direct or indirect impact on the satisfaction of local residents. In particular, satisfaction with public security had both a direct and an indirect effect on how satisfied the community was [
37]. In this context, it is also worth mentioning some interesting findings from research by J. Stokes et al. on the impact of spending cutbacks in 147 local government units in England on the multimorbidity of the population (which also affects the standards of living). It turns out that cutting spending on services by 1% per capita entails a 0.1% increase in the incidence of multimorbidity. Taking the budget classification into account, a 1% decrease in spending on public health entailed a 0.15% increase in the incidence of multimorbidity, whereas a 1% decrease in spending on adult social care resulted in a 0.01% decrease in the average health-related quality of life [
38]. Another study worth mentioning is that by B. Siregar and N. Pratiwi, who used 1003 financial reports of Indonesian local government units for the years 2009–2013 to demonstrate (using regression models estimated with PLS, partial least squares) that the administrative age of LGUs (the time elapsed since the establishment of a unit under the relevant act), their status and the number of authorized financial management employees have a positive and significant impact on the local government’s financial self-reliance. At the same time, financial self-reliance at the local level has a positive and significant impact on the Human Development Index (HDI) [
39].
3. Materials and Methods
A survey questionnaire was the research tool used in this study. Only a selection of research areas was analyzed for the purposes of this paper. The survey was carried out between early January and late February 2022 with presidents and vice-presidents of commune councils (the legislative and control authority of local government at the commune level) based on the CATI (computer-assisted telephone interviewing) approach.
The research sample should be a reliable representation of the population. The greater the proportion of the population in the sample, the higher the representativeness. As the sample size gets closer to the population size, the result becomes more reliable (also, the greater the research sample, the higher the precision of estimators) [
40]. The size of the sample covered by this study was calculated with the finite population formula structured as follows:
$$n = \frac{Z_{\alpha}^{2}\, P(1-P)\, N}{e^{2}(N-1) + Z_{\alpha}^{2}\, P(1-P)}$$
where
P is the estimated proportion in the population (usually set at 50%);
n is the sample size;
N is the population size;
e is permissible error; and
Zα is the critical value of the standard normal distribution for the chosen confidence level (equal to 1.96 for a confidence level of 95%).
The total size of the population is N = 2477 (the number of communes in Poland). The maximum permissible error and the confidence level were set at 6% and 95%, respectively.
With these data, the minimum size of the research sample is 241 (
Table 1). Generally, the sampling can be performed in three ways: as purposive sampling, by composing the sample of voluntary participants, or by random sampling. The stratified random sampling method (which takes account of the population’s heterogeneity) was employed to ensure representativeness across the whole population of presidents and vice-presidents of commune councils in Poland.
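The sample size calculation can be checked with a short script. The sketch below assumes the standard finite-population formula for estimating a proportion (the function name and argument names are illustrative, not taken from the paper); with the parameters stated in the text it reproduces the reported minimum of 241.

```python
import math


def finite_population_sample_size(N, e, z=1.96, p=0.5):
    """Minimum sample size for estimating a proportion in a finite population.

    N: population size, e: permissible error, z: critical value for the
    chosen confidence level, p: assumed proportion in the population.
    """
    variance_term = z ** 2 * p * (1 - p)
    n = (variance_term * N) / (e ** 2 * (N - 1) + variance_term)
    # Round up: a fractional respondent cannot be surveyed.
    return math.ceil(n)


# Parameters stated in the text: 2477 communes, 6% error, 95% confidence.
n_min = finite_population_sample_size(N=2477, e=0.06)
print(n_min)  # 241
```

Setting p = 0.5 maximizes p(1 − p), so the result is the most conservative sample size for the given error and confidence level.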
Stratified sampling means dividing the whole population into strata and independently drawing a defined number of elements from each of them. The stratified random sampling method was used in building the sample pro rata to the number and type of communes, based on the known structure of the total population (number of communes in voivodeships, commune types). The size of the random stratified samples was calculated in proportion to the size of the respective strata (
Table 2).
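Proportional allocation of the sample to strata can be sketched as follows. This is a minimal illustration, assuming a largest-remainder rounding rule (the paper does not state how fractional allocations were rounded); the strata sizes below are placeholders chosen only so that they sum to 2477 and are not the figures from Table 2.

```python
def proportional_allocation(strata_sizes, sample_size):
    """Split sample_size across strata in proportion to their sizes.

    Fractional shares are rounded down, and the leftover units go to the
    strata with the largest fractional parts (largest-remainder rule).
    """
    total = sum(strata_sizes.values())
    exact = {name: sample_size * size / total for name, size in strata_sizes.items()}
    allocation = {name: int(share) for name, share in exact.items()}
    leftover = sample_size - sum(allocation.values())
    by_remainder = sorted(exact, key=lambda name: exact[name] - allocation[name], reverse=True)
    for name in by_remainder[:leftover]:
        allocation[name] += 1
    return allocation


# Hypothetical strata by commune type (illustrative values only).
strata = {"urban": 302, "urban-rural": 662, "rural": 1513}
allocation = proportional_allocation(strata, sample_size=241)
print(allocation)
```

The same function can be applied to a finer stratification (for example, voivodeship by commune type) by passing one dictionary entry per stratum.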
The table below shows the distribution of the sample between voivodeships (
Table 3).
Due to the multidimensional nature of aspects covered by this research, it is not recommended to use a single indicator to represent both the financial situation of local government units and the standards of living of their residents. Hence, the analysis covered two sets of variables.
A total of 21 variables were used in determining the standards of living of the population (1: very low, 7: very high) cf. [
13,
42,
43,
44,
45]: S1: availability and quality of public healthcare services; S2: public security level; S3: conditions for enjoyable ways of spending free time; S4: education conditions for children and youth (primary schools); S5: ability for adults to improve their skills; S6: availability of nurseries; S7: cleanness of the natural environment; S8: noise (the lower the variable, the greater the noise nuisance); S9: general esthetics of buildings and green areas; S10: quantitative and qualitative condition of roads; S11: communication links (including buses and trains); S12: street lighting; S13: development level of water and sewage infrastructure; S14: development level of the gas network; S15: development level of sports and cultural services; S16: development level of the telecommunications network (including mobile network coverage, broadband Internet access); S17: rating of tourism facilities (motels, hotels, agri-tourism); S18: situation in the local labor market (wage level, the ease of finding a job); S19: availability of selected services (hairdresser, beautician, shoemaker, tailor, home appliance repair); S20: shopping conditions (including the total number and diversity of shops, availability of discount stores); S21: social security (ability to obtain financial and non-financial support from public institutions).
In turn, 35 variables were used to assess the financial situation of communes, ranked from 1 (not significant) to 7 (extremely significant) cf. [
30,
46,
47,
48]: FC1: rank of education expenditure in the budget; FC2: share of public administration expenditure in the budget; FC3: share of healthcare expenditure in the budget; FC4: rank of social assistance expenditure in the budget; FC5: share of expenditure on water and sewage infrastructure in the budget; FC6: share of public security expenditure in the budget; FC7: share of sports and leisure expenditure in the budget (including community centers, tourism infrastructure); FC8: share of environmental protection expenditure in the budget (including waste sorting, removal of asbestos); FC9: share of expenditure on transportation and communications infrastructure in the budget; FC10: investment expenditure per capita; FC11: share of expenditure on assets in total expenditure; FC12: share of agricultural tax in the budget; FC13: share of forestry tax in the budget; FC14: share of property tax in the budget; FC15: share of car tax in the budget; FC16: share of local fees in the budget (marketplace fee, tourism fee, visitor’s tax, dog tax, advertising fee); FC17: total revenue per capita; FC18: own revenue per capita; FC19: current transfers per capita; FC20: share of current revenue in total revenue; FC21: share of own revenue in total revenue; FC22: share of revenue derived solely from personal and corporate income taxes in total revenue; FC23: share of total subsidies in total revenue; FC24: share of earmarked subsidies in total revenue; FC25: share of funds derived from EU resources in total revenue; FC26: self-financing ratio of the commune (the degree to which the local government unit finances its investments with own funds); FC27: share of operating surplus in total revenue; FC28: share of operating surplus and of revenue from property sold in total revenue; FC29: total debt of the commune; FC30: total liabilities per capita; FC31: ratio of total liabilities to total revenue; FC32: ratio of long-term liabilities to total revenue; FC33: ratio of debt servicing 
expenses to total revenue; FC34: ratio of debt servicing expenses to own revenue; FC35: share of maturing liabilities in total liabilities.
A canonical analysis was performed to present the multidimensional dependencies between the sets of variables proxying for the financial situation of communes and for the population’s standards of living. With the canonical analysis, the assessment of dependencies between the two initial sets of variables (explanatory variables {X1, X2, …, Xp} and explained variables {Y1, Y2, …, Yq}) boils down to analyzing the relationships between latent variables. These new latent variables are a specific type of synthetic indicator of the correlation between the two sets, calculated as the weighted sum of the variables of the sets considered, i.e.: a1X1 + a2X2 + … + apXp and b1Y1 + b2Y2 + … + bqYq.
The essence of the canonical analysis is to look for pairs of linear functions which meet three cumulative conditions [
49]:
are an approximation of sets of variables X and Y;
are maximally correlated, i.e., express the degree to which the set Y is statistically determined by the set X;
express maximum independence between pairs of canonical variates, i.e., take account of the particularity of variance explained by successive pairs of canonical variates.
Meeting the maximum correlation condition means the weighted sum pairs can be considered a fair representation of the initial data of the model used in the study. A weak or non-existent correlation would reflect the actual absence of relationships between the sets considered. Maximum correlation is sought using the method of Lagrange undetermined multipliers, cf. [
50,
51,
52,
53,
54,
55]. The study considers a system of two random vectors, with
$X = [X_1, X_2, \ldots, X_p]^T$ as the vector of explanatory variables and
$Y = [Y_1, Y_2, \ldots, Y_q]^T$ as the vector of explained variables. The canonical analysis seeks to maximize the canonical correlation expressed as:
$$r_l = \max_{w_x, w_y} \frac{w_x^T R_{xy} w_y}{\sqrt{(w_x^T R_{xx} w_x)(w_y^T R_{yy} w_y)}}$$
where
Rxx is the correlation matrix for explanatory variables;
Ryy is the correlation matrix for explained variables;
Rxy is the matrix of correlations between the two sets of variables;
wx, wy are the weights for first-type and second-type canonical variates; and
rl is the canonical correlation coefficient.
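The maximization described above can be sketched numerically. The snippet below is a minimal illustration, not the procedure used in the study: it assumes NumPy is available, uses synthetic data, and solves the problem by whitening both correlation matrices and taking a singular value decomposition (an equivalent route to the Lagrange multipliers formulation). The function name `canonical_correlations` and all data are hypothetical.

```python
import numpy as np

def canonical_correlations(X, Y):
    """Return canonical correlations (descending) and weight matrices.

    X: (n, p) explanatory variables; Y: (n, q) explained variables.
    The weights apply to the standardized variables.
    """
    # Standardize so that the cross-product blocks are correlation matrices.
    Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    Ys = (Y - Y.mean(axis=0)) / Y.std(axis=0, ddof=1)
    n = X.shape[0]
    Rxx = Xs.T @ Xs / (n - 1)
    Ryy = Ys.T @ Ys / (n - 1)
    Rxy = Xs.T @ Ys / (n - 1)

    def inv_sqrt(R):
        # Inverse square root of a symmetric positive definite matrix.
        vals, vecs = np.linalg.eigh(R)
        return vecs @ np.diag(vals ** -0.5) @ vecs.T

    Kx, Ky = inv_sqrt(Rxx), inv_sqrt(Ryy)
    # Singular values of the whitened cross-correlation matrix
    # are the canonical correlations r_l.
    U, r, Vt = np.linalg.svd(Kx @ Rxy @ Ky)
    s = min(X.shape[1], Y.shape[1])  # number of canonical roots
    Wx = Kx @ U[:, :s]               # weights w_x, one column per root
    Wy = Ky @ Vt.T[:, :s]            # weights w_y, one column per root
    return r[:s], Wx, Wy

# Synthetic example: the first Y column is a noisy copy of the first X column.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
Y = np.column_stack([X[:, 0] + 0.1 * rng.normal(size=200),
                     rng.normal(size=200)])
r, Wx, Wy = canonical_correlations(X, Y)
print(r)  # two canonical correlations, largest first
```

With p = 3 and q = 2 the sketch returns two canonical roots, in line with the rule that the number of roots equals the size of the smaller set.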
The results of a canonical analysis are sensitive to atypical values (outliers) which can contribute to an erroneous picture of the research area covered by the study (an outlier can affect the numeric value of relationships between variables, suggesting for instance that they are highly correlated in a situation in which there is no correlation). Therefore, the three-sigma rule cf. [
56] was used to identify atypical items in both sets under consideration. Accordingly, items which fall outside the interval [mean − 3*standard deviation; mean + 3*standard deviation] were treated as outliers. Other ways of identifying outliers include the analysis of dispersion graphs.
If identified, outliers were replaced with mean values for regions (NUTS-2 level) which are home to units with sub-variables outside the defined thresholds (mean ± 3*standard deviation). In these analyses, the above procedure needed to be used four times for the set of variables relating to the standards of living (including three times because the values are below the lower boundary of the defined interval and one time because the value is above the upper boundary), and 19 times for the communes’ financial situation (nine times because the values are below the lower boundary and 10 times because the values are above the upper boundary).
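The outlier procedure can be sketched as follows. This is a minimal illustration with hypothetical data and a hypothetical function name (`replace_outliers`). It also assumes that the regional mean is computed from the non-flagged units of the same region, a detail the text does not specify.

```python
from statistics import mean, stdev

def replace_outliers(values, regions, k=3.0):
    """Flag values outside mean +/- k*sd and replace them with a regional mean.

    Assumption: the regional mean uses only non-flagged units of the same
    region. If a region has no such unit, statistics.mean raises an error.
    """
    m, s = mean(values), stdev(values)
    lower, upper = m - k * s, m + k * s
    flagged = [v < lower or v > upper for v in values]

    cleaned = list(values)
    for i, is_outlier in enumerate(flagged):
        if is_outlier:
            peers = [v for v, reg, f in zip(values, regions, flagged)
                     if reg == regions[i] and not f]
            cleaned[i] = mean(peers)
    return cleaned, flagged

# Hypothetical data: 20 communes in two regions, one extreme value.
values = [9.0, 10.0, 11.0] * 6 + [10.0, 100.0]
regions = ["A"] * 10 + ["B"] * 10
cleaned, flagged = replace_outliers(values, regions)
print(sum(flagged), cleaned[-1])
```

Note that in very small samples no observation can lie more than three standard deviations from the mean, so the rule only becomes informative with a reasonable number of units.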
The total number of pairs of canonical variates is equal to the number of variables in the smaller of the two sets considered. Hence, it is crucial to determine how many pairs of canonical variates need to be subject to an in-depth examination. This can be accomplished using the significance test of canonical correlation coefficients, with the null hypothesis being the absence of a relationship between two sets of input variables. The significance of pairs of canonical variates is verified with the Wilks’ Λ (Wilks’ lambda) test statistic, expressed as follows for a set of
s–
k variables [
57,
58]:
$$\Lambda_k = \prod_{l=k+1}^{s} (1 - r_l^2)$$
with
s as the number of canonical roots;
k as the number of canonical roots removed; and
$r_l^2$ as the squared coefficient of canonical correlation for the canonical variate l.
Under the assumption that the null hypothesis is true, the above statistic follows the Wilks’ Λ probability distribution with n − 1, p, q as the parameters.
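The sequential character of the test can be illustrated with a short sketch. It assumes the usual product form of the statistic, i.e., the product of (1 − r_l²) over the roots that remain after the first k are removed. The canonical correlations used here are hypothetical, not those reported in the study.

```python
from math import prod

def wilks_lambda(canonical_corrs, k=0):
    """Wilks' lambda after removing the first k canonical roots.

    canonical_corrs must be sorted in descending order.
    """
    return prod(1.0 - r * r for r in canonical_corrs[k:])

# Hypothetical canonical correlations for s = 3 roots.
corrs = [0.9, 0.6, 0.2]
for k in range(len(corrs)):
    print(k, round(wilks_lambda(corrs, k), 6))
```

Values close to zero indicate a strong relationship among the remaining roots, and the statistic moves toward 1 as successive roots are removed.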
For each of the generated canonical roots, the canonical analysis made it possible to calculate the variance extracted, an indicator which shows the percentage of variance of input variables explained by the canonical variates. It is the sum of the squared factor loadings of each variable found in the set corresponding to the canonical root concerned divided by the number of input variables in that set. The mean variances calculated this way can be expressed with the following formulas:
$$\frac{1}{q} \sum_{j=1}^{q} c_{jl}^2$$
or
$$\frac{1}{q} \sum_{j=1}^{q} d_{jl}^2$$
with
q as the number of input variables in the respective set;
cjl as the canonical factor loading for base variable j and first-type canonical variate l; and
djl as the canonical factor loading for base variable j and second-type canonical variate l.
Additionally, the canonical analysis included finding the redundancy index (also referred to as the compound coefficient of determination) expressed as the mean variances (calculated above) multiplied by squared canonical correlations. It specifies the amount of mean variance in a set explained by a canonical variate given the other set of variables, and can be presented in its analytical form:
$$\lambda_l \cdot \frac{1}{q} \sum_{j=1}^{q} c_{jl}^2$$
or
$$\lambda_l \cdot \frac{1}{q} \sum_{j=1}^{q} d_{jl}^2$$
where
λl is the characteristic root of the matrix of squared canonical correlations, i.e., the squared canonical correlation for the canonical variate l.
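Both indicators reduce to simple arithmetic on factor loadings, as the sketch below shows. The loadings and the canonical correlation are hypothetical values chosen for illustration, and the function names are not taken from the study.

```python
def variance_extracted(loadings):
    """Mean of squared factor loadings for one canonical variate."""
    return sum(c * c for c in loadings) / len(loadings)

def redundancy(loadings, canonical_corr):
    """Variance extracted multiplied by the squared canonical correlation."""
    return variance_extracted(loadings) * canonical_corr ** 2

# Hypothetical loadings of four input variables on one canonical variate.
loadings = [0.9, 0.5, -0.3, 0.1]
ve = variance_extracted(loadings)   # (0.81 + 0.25 + 0.09 + 0.01) / 4
rd = redundancy(loadings, 0.8)      # ve * 0.64
print(round(ve, 4), round(rd, 4))
```

Because the squared canonical correlation never exceeds 1, the redundancy of a variate cannot be greater than its variance extracted.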
The whole study used a single significance level (α = 0.05) and addressed only those “categories” for which the p-value was below that level.
Canonical analysis is a method that requires the assumption that the set of variables subject to it follows a normal distribution. Due to the difficulty in ensuring that normal distribution is followed by every variable covered by the study, in the context of economic phenomena it is more reasonable to use the canonical analysis for descriptive purposes rather than for statistical inference.
A certain alternative would be to ignore the results of examining these assumptions and to process the data as if it were distributed normally. However, such an approach could lead to erroneous outcomes. Another option is to transform the data to bring its distribution closer to a normal model. Although many research projects do not accord great attention to it, the importance of transformation is often appreciated in spatial analyses [
59].
In both sets under consideration, the normality of the distribution was examined using the results of the Shapiro–Wilk test. Conceived by S.S. Shapiro and M.B. Wilk [for more information, see [
60]], it is one of the best normality tests and demonstrates great robustness even in the case of large samples (however, if larger than 2000, other procedures are recommended, e.g., the Lilliefors test [for more information, see [
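61]]). In this test, the statistic is a random variable expressed as:
$$W = \frac{\left(\sum_{i=1}^{n} a_i x_{(i)}\right)^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$$
where
$x_{(i)}$ is the i-th smallest value in the sample, $\bar{x}$ is the sample mean, and $a_i$ are the constants which can be found in the dedicated tables for that test.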
Its statistical significance is verified by testing the following hypotheses: H0: F(x) = F0(x), with F0(x) as the normal distribution function, versus the alternative hypothesis H1: F(x) ≠ F0(x).
If some variables are identified that fail to follow the normal distribution, the Box–Cox transformation is used to make an approximation of the normal distribution as follows [
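62]:
$$y^{(\lambda)} = \begin{cases} \dfrac{y^{\lambda} - 1}{\lambda}, & \lambda \neq 0 \\ \ln y, & \lambda = 0 \end{cases}$$
The transformation parameter λ is selected based on maximum likelihood estimation.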
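A minimal sketch of the transformation and of the choice of λ is given below. It assumes the standard one-parameter Box–Cox form for strictly positive data and picks λ by maximizing the profile log-likelihood over a grid. The grid, the function names and the data are illustrative assumptions rather than the settings used in the study.

```python
import math

def box_cox(y, lam):
    """One-parameter Box-Cox transform of strictly positive data."""
    if any(v <= 0 for v in y):
        raise ValueError("Box-Cox requires strictly positive data")
    if abs(lam) < 1e-12:
        return [math.log(v) for v in y]
    return [(v ** lam - 1.0) / lam for v in y]

def profile_loglik(y, lam):
    """Profile log-likelihood of lambda under a normal model (up to a constant)."""
    z = box_cox(y, lam)
    n = len(z)
    zbar = sum(z) / n
    var = sum((v - zbar) ** 2 for v in z) / n  # ML variance estimate
    return -0.5 * n * math.log(var) + (lam - 1.0) * sum(math.log(v) for v in y)

def estimate_lambda(y, grid=None):
    """Grid search for the lambda that maximizes the profile log-likelihood."""
    grid = grid or [i / 100 for i in range(-200, 201)]
    return max(grid, key=lambda lam: profile_loglik(y, lam))

# Hypothetical right-skewed data whose logarithms are symmetric around zero.
y = [math.exp(i / 10) for i in range(-10, 11)]
lam_hat = estimate_lambda(y)
print(lam_hat)
```

For data of this kind the estimate lands at or near λ = 0, which corresponds to a logarithmic transformation.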
4. Results
According to the survey carried out with the presidents and vice-presidents of Polish communes, most of them (66%) view their local government unit as moderately wealthy. Fewer than 9% of respondents considered their unit to be wealthy or very wealthy (
Table 4).
In assessing particular aspects of the standards of living in communes, the development level of the telecommunications network (5.07) and the general esthetics of buildings (4.93) received the highest ratings (on a scale from 1 to 7). Conversely, the interviewees viewed the availability of nurseries (3.01) and the availability and quality of public healthcare services (3.29) as poor (
Table 5).
Improving the availability and quality of public healthcare services (43.6%), extending the roads and improving their quality (38.2%) and improving the cleanness of the natural environment (32.4%) were given the highest priority in the context of improving the standards of living. In turn, improving the shopping conditions and increasing the number of publicly owned apartments were considered to be the least important measures (each with 21.6%). What needs to be noted is the importance of environmentally focused measures. Measures taken to improve the cleanness of air and initiatives focused on supporting renewable energies were found to be very important or a priority by more than 75% and more than 85% of respondents, respectively (
Table 6).
The canonical analysis was carried out to provide more in-depth insights. As a tool employed in multidimensional comparative analyses, it enabled estimating the relationships between two pre-selected datasets (relating to the commune’s financial situation and the residents’ standards of living). It was then used to determine the extent and direction of relationships between the two sets of variables relating to these aspects. The null hypothesis formulated in the canonical analysis claimed the absence of relationships between the sets of variables (meaning that each canonical correlation is zero). If the null hypothesis in the above wording is rejected, the assumption is made that at least the first pair of canonical variates generated (the one with the highest value) is statistically significant. The statistical significance of these canonical variates is verified with the Wilks’ lambda test, a sequential procedure which initially takes all of them into account. Then, in successive steps, it tries to reject the hypothesis on the absence of relationships between the datasets by ignoring the co-variability reflected by the first
k canonical correlations (
Table 7).
As mentioned earlier, the total number of canonical roots generated always corresponds to the minimum number of variables covered in one of the sets under consideration. In this case, there are 21 canonical roots, which is explained by the size of the set of variables describing the population’s standards of living. The first pair of roots generated by the canonical analysis—which provides a synthetic description of interactions between the sets of variables relating to the financial situation of communes and the population’s standards of living—explains most relationships between them. Hence, research practice places the greatest emphasis on the correlation found in the first canonical variate. In this context, P. Churski [
63] claims that of all the estimated coefficients of canonical correlation, only the first one (the one with the highest value, which relates to the strongest relationship between the combinations of dependent and independent variables) should be selected. However, it is important to note that the first pair of canonical variates fails to fully explain the relationships between the variables considered. As a consequence, it is worthwhile to determine subsequent pairs of canonical roots as they explain the relationships in other (less significant) dimensions. The roots generated by the canonical analysis are not correlated with each other (as they explain the relationships between sets of input variables in other dimensions), and explain increasingly smaller portions of variation. Note also that the canonical correlations will be increasingly smaller. The author of this paper believes that an in-depth analysis should be carried out on all statistically significant canonical variates (there are six of them) because they might contribute relevant information on the co-variability between the datasets considered.
The canonical correlations were arranged with values in descending order (
Table 7). Note that the canonical correlations cannot be interpreted in the same way as the classical (e.g., Pearson) correlation. They are correlations between weighted sums in each set, with the weights being calculated for successive canonical variates.
The highest canonical correlation was over 0.93 (the value of the Wilks’ lambda statistic used to verify its significance was 0.0006). For the second statistically significant canonical variate, it was nearly 0.84. The values for successive canonical variates were much smaller. The sixth (last statistically significant) canonical variate had a correlation coefficient of slightly above 0.57. The calculated value of the canonical correlation coefficient determines the level of confidence in the coefficient of compound determination. A high correlation between sets X and Y is considered to provide a reliable basis for accepting the canonical determination, whereas a small or statistically insignificant value of the canonical correlation coefficient does not provide sufficient grounds. The absence of a canonical correlation would mean either that the model structured in the study is inadequate or that actually no relationships exist between the sets under consideration. Squared canonical correlations measure the degree to which linear relationships explain the variation in one set of variables by the second input set with successive pairs of canonical variates. For the first, second and last statistically significant canonical variate, squared canonical correlations are almost 0.87, almost 0.70 and 0.33, respectively. It can therefore be assumed that the model developed in this study provides quite an adequate description of the datasets considered. The high and statistically significant values of canonical correlation for the first six pairs of canonical roots mean that the linear models used adequately describe both datasets relating to the communes’ financial capacity and the residents’ standards of living. The remaining pairs of canonical variates are not correlated in a statistically significant way, and therefore (as mentioned earlier) are ignored in further description and interpretation.
This study intended to examine the structure of relationships between sets of sub-variables describing the communes’ financial capacity and the residents’ standards of living. The canonical weights (see
Table A1) developed for these sets of variables make it easier to explore the structure of canonical variates by showing the contribution each variable makes to the weighted sum (these weights are often interpreted similarly to the
beta coefficients in multiple regression).
It follows from the calculations that for the most statistically significant canonical variate, the greatest (absolute) weights are found in variables S10 (0.5870) and FC10 (0.7926). Based on the above, it can be assumed that the investment expenditure per capita and the quantitative and qualitative conditions of roads contributed the most to the first canonical variate. When it comes to determining the second statistically significant canonical variate using the sub-variables covered by the study, the greatest contribution was recorded for FC3 (−0.7936) related to the share of healthcare expenditure in the budget and for S1 (−0.7445) related to the availability and quality of public healthcare services. In the case of the third canonical variate, the greatest absolute values of weights were identified for sub-variables FC18 (−0.5500) related to the amount of own revenue per capita and S10 (−0.4233) representing the condition of roads in the unit concerned. In turn, the greatest contributors to the fourth canonical variate were FC17 (−0.6522), the variable that represents total revenue per capita, and S16 (0.5216) related to the development level of the telecommunications network. Regarding the fifth canonical variate, the greatest weights were identified for sub-variables S3 (0.8609) referring to the conditions for enjoyable ways of spending free time and FC7 (0.6426) which reflects the share of sports and leisure expenditure in the budget. Finally, for the last statistically significant variate, the greatest canonical weights were identified in variable FC18 (−0.6762) relating to own revenue per capita and in variable S13 (0.8144) which describes the development level of water and sewage infrastructure.
To dive deeper into analyzing the structure of canonical roots, this study also calculated the values of canonical factor loadings which are equated with coefficients of correlation between a canonical variate and input variables. The higher the (absolute) value of a factor loading, the greater should be the importance attached to that variable when interpreting the relevant canonical root. The literature on the subject is not unanimous regarding the critical value of factor loadings for particular variables that need to be subject to an in-depth analysis. J. Zwierzchowski and T. Panek [
64] suggest interpreting the variables for which the squared coefficient of correlation exceeds 0.50. Conversely, G. Więcek and A. Sękowski [
65] believe that only the variables with a factor loading (rather than a squared loading) over 0.30 (in absolute terms) should be considered. In these analyses, the critical value of that correlation coefficient was set at 0.40.
In the set of variables relating to the financial situation of communes, the first canonical root has the greatest factor loading for the variable FC10 (0.9388); the second canonical variate has the greatest factor loading for the variable FC3 (−0.8468); it is FC27 (−0.5063) in the third one; FC29 (0.4255) in the fourth one; FC7 (0.4742) in the fifth one; and FC17 (0.3649) in the last statistically significant canonical variate. In turn, when it comes to the set of variables relating to the standards of living, the first canonical root has the greatest factor loading for the variable S10 (0.9357); it is the variable S1 (−0.7669) in the second one; S7 (−0.5800) in the third one; S20 (−0.6199) in the fourth one; S3 (0.6592) in the fifth one; and S1 (−0.3091) in the last statistically significant canonical variate.
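The 0.40 threshold can be applied mechanically to the largest loadings listed above. The short sketch below uses only the values quoted in this paragraph. The dictionary layout and variable names are illustrative.

```python
THRESHOLD = 0.40  # critical absolute value of a factor loading used in the study

# Largest (absolute) factor loading per canonical root, as reported above.
top_finance = {1: ("FC10", 0.9388), 2: ("FC3", -0.8468), 3: ("FC27", -0.5063),
               4: ("FC29", 0.4255), 5: ("FC7", 0.4742), 6: ("FC17", 0.3649)}
top_living = {1: ("S10", 0.9357), 2: ("S1", -0.7669), 3: ("S7", -0.5800),
              4: ("S20", -0.6199), 5: ("S3", 0.6592), 6: ("S1", -0.3091)}

# A root is interpretable if at least one of its top loadings reaches the threshold.
interpretable = [
    root for root in top_finance
    if max(abs(top_finance[root][1]), abs(top_living[root][1])) >= THRESHOLD
]
print(interpretable)  # [1, 2, 3, 4, 5]
```

Since the largest loadings for the sixth root (0.3649 and −0.3091) are both below 0.40 in absolute terms, no loading for that root can reach the threshold.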
Some researchers believe that canonical factor loadings must be used in interpreting each canonical variate because they are easy to understand intuitively. However, it needs to be noted that these coefficients tell how much correlation there is between individual input variables and canonical variates; unlike canonical weights, they do not take into account co-variability effects inside the set of input variables under consideration. As a consequence, the interpretation of canonical roots based on correlation coefficients can lead to other findings than a more complete “multidimensional” interpretation underpinned by canonical weights [
64]. This was the decisive argument for relying on the latter interpretation method in this analysis.
Based on canonical weights and factor loadings (and substantive grounds), it can be concluded that the first statistically significant canonical root explained the following relationships:
the greater the share of expenditure on water and sewage infrastructure in the budget, the higher the development level of water and sewage infrastructure;
the quantitative and qualitative conditions of roads improve along with: an increase in the share of expenditure on transportation and communications infrastructure in the budget; an increase in investment expenditure per capita; an increase in the share of expenditure on assets in total expenditure; and the growth in importance of car tax in the budget;
a positive relationship exists between the share of public security expenditure in the budget, the share of sports and leisure (including community centers, tourism infrastructure) expenditure in the budget, the share of environmental protection expenditure (on the one side) and a higher ranking of the conditions for enjoyable ways of spending free time (on the other side);
a positive relationship exists between the importance of revenue from forestry tax (largely dependent on how densely the area is covered by forests) and from property tax (the amount of revenue from property tax somehow reflects the residents’ economic activity—which has an effect on the amount and ways of spending their free time—because revenue streams from owners of property not related to an economic activity are much smaller than when related to an economic activity) on the one side and the conditions for spending free time on the other;
as total revenue and own revenue per capita grow, there is an improvement in the quantitative and qualitative condition of roads, in the conditions for enjoyable ways of spending free time and in the development level of water and sewage infrastructure.
When analyzing factor loadings and canonical weights in the second statistically significant canonical root, note that as the share of healthcare expenditure in the budget declines, so does the availability and quality of public healthcare services.
The following can be concluded based on the values of factor loadings and canonical weights generated for the third canonical root:
growth in the self-financing ratio of the commune, in the share of operating surplus in total revenue, in the share of operating surplus and of revenue from property sold in total revenue entails improvements in: public security; the ability for adults to improve their skills; the cleanness of the natural environment; and the rating of tourism facilities;
a similar relationship exists for the ratio of total liabilities per capita; this is probably because, on the one hand, higher liabilities per capita suggest the local government unit has a smaller financial potential in fulfilling its tasks. However, on the other hand, the liabilities usually result from the need to rely on external resources in financing investments. Hence, a high liabilities per capita ratio could mean the LGU is a highly active investor and is capable of ensuring greater revenue in the future cf. [
66].
when there is growth in: the ratio of debt servicing expenses to total revenue; the ratio of debt servicing expenses to own revenue; and the share of maturing liabilities in total liabilities, there might be a deterioration in: public security; the ability for adults to improve their skills; the cleanness of the natural environment; and the rating of tourism facilities.
Conversely, when comparing canonical weights to factor loadings for the fourth canonical variate, it can be concluded that as the commune’s self-financing ratio (which represents the degree to which the local government unit finances its investments with its own funds) grows, and as the total debt level declines, there is improvement in: education conditions for children and youth; street lighting; shopping conditions; and the development level of sports and cultural services. In turn, based on values of canonical weights and factor loadings for the fifth canonical root, it can be stated that as the share of sports and leisure expenditure in the budget grows, there is an improvement in the conditions for spending free time and in education conditions for children and youth.
Next, according to the analysis of factor loadings for the sixth canonical root, none of the factor loadings exceeded the critical value of 0.40. Therefore, factor loadings and canonical weights were not interpreted for this canonical variate.
The analyses included calculating the squared coefficient of correlation, referred to as the coefficient of determination, which reflects the proportion of variance in one variable explained by another one. By squaring the factor loading values (representing the correlation), the author determined how much variance of a variable is explained by the canonical variate. Moreover, the mean value of that proportion for all variables tells the average percentage of variance explained by the given canonical variate in that dataset. This kind of variance is referred to as variance extracted (see
Table 8).
The next step consisted in multiplying the variance extracted by the squared canonical correlation (the corresponding eigenvalue of the matrix derived from the correlations between the variables of the two sets). This resulted in a new “synthetic indicator” referred to as the redundancy of a set of variables with respect to another set. It shows the portion of mean variance in one set explained by the given canonical variate when another set is known (in other words, how redundant a dataset is when another dataset exists). Total redundancy means the sum of redundancies calculated for all canonical variates created.
The most statistically significant canonical variate extracts nearly 17% of variance in the set of variables related to the financial situation of communes and almost 14% in the second set (related to the population’s standards of living). In turn, the second canonical variate extracts ca. 8% in the first set and over 5% in the second. For other canonical roots, variance extracted varied in the range from below 2.8% (for the sixth canonical root) to 7.2% (for the third canonical root) in the set of variables relating to the financial situation of communes, and from below 3.0% (sixth canonical root) to nearly 10% (fourth canonical root) in the set of variables relating to the population’s standards of living.
The latter set of variables can explain 14.6%, 5.7%, 3.4%, 2.0%, 1.1% and 0.9% (respectively) of variance in the set of variables relating to the financial situation of communes. Conversely, the set of input variables referring to the financial situation of communes can explain 12.0%, 3.8%, 3.7%, 4.1%, 2.4% and 1.0% (respectively) of variance in the second set based on the first six statistically significant canonical variates. Hence, the third and successive statistically significant canonical variates have only a small contribution to explaining the variation.
The analysis also included calculating total redundancy (it seems to be one of the key indicators generated in a canonical analysis), interpreted as the mean percentage of variance in one set of variables explained by the second set based on all canonical variates. It follows from the calculations that when the variables relating to the financial situation of communes are known, they can explain 32.34% of variance in the set of variables used in describing the population’s standards of living. The value of this indicator can be considered high. In order to obtain better results, it would be worthwhile to carry out a study with another set of input variables and another number of variables.
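The reported figures can be cross-checked with simple arithmetic, since total redundancy is the sum of the redundancies of all canonical variates. The sketch below uses only the percentages quoted in the text. The variable names are illustrative.

```python
# Redundancies (%) reported for the six statistically significant variates.
living_given_finance = [12.0, 3.8, 3.7, 4.1, 2.4, 1.0]
finance_given_living = [14.6, 5.7, 3.4, 2.0, 1.1, 0.9]
total_living_given_finance = 32.34  # total redundancy over all 21 variates

six_sum_living = round(sum(living_given_finance), 2)
six_sum_finance = round(sum(finance_given_living), 2)
# Share of total redundancy left to the 15 non-significant variates.
remainder = round(total_living_given_finance - six_sum_living, 2)

print(six_sum_living, six_sum_finance, remainder)  # 27.0 27.7 5.34
```

The six significant variates thus account for 27.0 of the 32.34 percentage points of total redundancy in the standards-of-living set, with the first variate alone contributing 12.0 points.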
The dispersion graph (
Figure 1) for the first statistically significant canonical variate does not reveal a strong dispersion of points representing the objects covered by the analysis. The points are positioned close to a straight line (with a positive slope). The above means that this pair of canonical variates conveys a significant portion of information on co-variability between the two sets of input variables considered. The closeness of most points (which, in this canonical analysis, represent selected Polish communes) could mean the input variables share a similar structure. In the dispersion graph for the last statistically significant canonical variate, the points representing the objects covered by the analysis are also positioned along a positively sloped line but are more dispersed with respect to it. It means that this pair of canonical variates conveys much less information on co-variability between the two sets of variables considered than the first pair of canonical variates.
5. Discussion
The canonical analysis allowed the author to present a comprehensive picture of the complex relationship structure between two sets of variables. The procedure included:
specifying the level of impact the set of independent variables (financial situation of Polish communes) has on the set of dependent variables (population’s standards of living);
determining the degree of correlation between each original variable in both sets and the canonical variate (factor loadings were calculated);
calculating the contribution (canonical weight) of each original variable to the canonical variate;
calculating the variances extracted which tell the part of total variation in the set that can be attributed to a specific canonical variate;
calculating the redundancy index which allows the author to tell how much mean variance in a set is explained by a specific canonical variate of the second set, and what the total redundancy, expressed as the sum of redundancies for all canonical variates, is.
Figure 2 presents a synthetic view of key results.
A review of the literature on the use of the canonical analysis shows it to be one of the least widespread statistical methods in social sciences. Accordingly, it is rarely used in the context of the population’s standard (or quality) of living and its determinants (especially the financial situation of local government units). For this reason, it is difficult to relate the results obtained to the results of other authors. Nonetheless, it is worth mentioning the study by O.R. Ebenezer [
67], who carried out a canonical analysis of data for the Ekiti state in Central Nigeria and demonstrated that a positive correlation exists between poverty level and literacy skills. M. Krzyśko et al. [
55] used the canonical analysis to identify the relationship between the quality of life and the level of human capital, on one side, and the development level of higher education facilities in Polish voivodeships, on the other. In turn, K. Chin-Tsai [
68] carried out a canonical analysis with a view to assessing the relationships between the cyclists’ quality of life and how satisfied they are with their jobs. Additionally, E.A. Vanner et al. [
69] used the canonical analysis to explore the multidimensional relationships between having a physical mental handicap and being socially/environmentally disabled, on one side, and the levels of physical and recreational activity and the quality of life, on the other.
The relatively low popularity of the canonical analysis in economic research can be explained by its complexity (it has a number of prerequisites, including the knowledge of multiple regression). Additionally, its outcomes can be difficult to interpret, due to such aspects as the large number of indicators calculated. However, considering the multifaceted nature of the phenomena covered by this study, it seems reasonable to use this multidimensional exploration technique in assessing the interactions between them. When investigating multifaceted developments, using tools such as multiple regression models and examining each explained variable one by one could narrow and distort the results of analyses, because there would be a risk of losing relevant data on interactions in sets of explained variables. Furthermore, it seems insufficient to rely solely on a classical correlation analysis between pairs of variables, as it fails to address the relationships inside the sets of variables covered. In turn, though frequently used, multiple correlation can only be used to measure non-linear or linear relationships between one variable and a set of explanatory variables.
This study did not group the communes into rural, urban and urban–rural units. Indeed, as demonstrated by A. Bieniasz et al., the mean synthetic metric of financial situation of Polish communes did not differ between the types of communes. However, rural communes are relatively much less financially autonomous than other types. Additionally, they demonstrated a remarkably higher level of current transfers per capita, and differed from other types (especially from urban–rural units) in the levels of operating surplus and debt per capita (each being highly variable) [
66]. A. Standar also demonstrated that irrespective of their type, most communes (70%) in the Wielkopolskie voivodeship have a medium financial situation. The best financial situation is witnessed in communes located in the immediate vicinity of big cities, whereas a remote location mostly restricts the revenue potential which reduces investment capacity [
70].
In future research projects, it would be worthwhile to weight the diagnostic sub-variables. It should however be noted that the literature often calls into question the procedure for weighting variables related to spatial data cf. [
61,
71]. It recommends that weight coefficients not be assigned to diagnostic variables for a number of reasons, including the fact that variables other than the selected ones would be assigned zero weights on an a priori basis. Another issue not addressed in this study is the distinction between an urban and a rural environment which would certainly enable a more complete analysis of this topic (as these areas differ in the functions they deliver).