Article

A Spatio-Temporal Analysis of the Health Situation in Poland Based on Functional Discriminant Coordinates

by Mirosław Krzyśko 1, Waldemar Wołyński 2,*, Marcin Szymkowiak 3,4 and Andrzej Wojtyła 5

1 Interfaculty Institute of Mathematics and Statistics, Calisia University-Kalisz, 62-800 Kalisz, Poland
2 Faculty of Mathematics and Computer Science, Adam Mickiewicz University, 61-614 Poznań, Poland
3 Institute of Informatics and Quantitative Economics, Poznań University of Economics and Business, 61-875 Poznań, Poland
4 Statistical Office in Poznań, 60-624 Poznań, Poland
5 Health Sciences Faculty, Calisia University-Kalisz, 62-800 Kalisz, Poland
* Author to whom correspondence should be addressed.
Int. J. Environ. Res. Public Health 2021, 18(3), 1109; https://doi.org/10.3390/ijerph18031109
Submission received: 7 December 2020 / Revised: 21 January 2021 / Accepted: 22 January 2021 / Published: 27 January 2021

Abstract

The aim of this study was to investigate if the provinces of Poland are homogeneous in terms of the observed spatio-temporal data characterizing the health situation of their inhabitants. The health situation is understood as a set of selected factors influencing inhabitants’ health and the healthcare system in their area of residence. So far, studies concerning the health situation of selected territorial units have been based on data relating to a specific year rather than longer periods. The task of assessing province homogeneity was carried out in two stages. In stage one, the original spatio-temporal data space (space of multivariate time series) was transformed into a functional discriminant coordinates space. The resulting functional discriminant coordinates are synthetic measures of the health situation of inhabitants of particular provinces. These measures contain complete information regarding 8 diagnostic variables examined over a period of 6 years. In the second stage, the Ward method, commonly used in cluster analysis, was applied in order to identify groups of homogeneous provinces in the space of functional discriminant coordinates. Sixteen provinces were divided into four clusters. The homogeneity of the clusters was confirmed by the multivariate functional coefficient of variation.

1. Introduction

Health is universally regarded as one of the most highly appreciated values. Good health is the main factor contributing to people’s well-being, which enhances their opportunities to participate in social life and to benefit from economic and employment growth. Better health is also consistently associated with greater life satisfaction (Ngamaba et al. [1]). A good health situation is instrumental in achieving good labor market outcomes. By reducing the individual’s capacity to work long hours, a deteriorated health status decreases their chance of getting employed and being productive at work and has a strong impact on the labor market situation (James et al. [2], OECD [3]).
For the purpose of this study, the term “health situation” is understood as inhabitants’ health described by a set of selected indicators and healthcare access in their area of residence. According to Penchansky and Thomas [4], healthcare access can be defined as a multi-faceted concept expressing the “degree of fit” between clients (patients) and the healthcare system according to five important dimensions: availability, accessibility, accommodation, affordability, and acceptability. In our article, we focus mainly on the first dimension, i.e., availability, which represents the “spatial” component of healthcare access. This choice is motivated by the fact that our understanding of availability is the same as that presented by Penchansky and Thomas [4], i.e., as the relationship between the number and type of existing services (and resources) and the number of patients and types of their needs. In other words, it represents the adequacy of the supply of physicians, dentists, and other providers; of facilities, such as clinics and hospitals; and of specialized programs and services, such as mental health and emergency care. The second dimension highlighted by Penchansky and Thomas [4], i.e., accessibility, also represents the “spatial” component of healthcare access and is understood as the relationship between the location of supply and the location of clients, while accounting for clients’ transportation resources and travel time, distance and cost. In our approach, we do not take this second spatial component into account, mainly due to limited data availability. Nevertheless, the literature on accessibility in the context of healthcare access is extensive (see, for instance, Wang [5] and Neutens [6]). There is also a rich literature devoted to applications of the access concept to various health care services, e.g., regarding health care accessibility analyzed in connection with availability (Barbarisi et al. [7], Bruno et al. [8], Lu et al. [9], Okuyama et al. [10], Pu et al. [11]).
The importance attached to health by different organizations is reflected in the way health protection is implemented in public policies, particularly in health policy, which aims to have a positive influence on population health (De Leeuw et al. [12]). This policy can be understood as government decisions and plans of action to make progress towards achieving the goals of the health system: improved health status of the population, better financial risk protection, and better client satisfaction; or intermediate outcomes for health systems, which include: quality, access, and efficiency (Campos and Reich [13]). In many countries, the main objective of public health policy is to create the conditions for good and equitable health for the entire population and within specific groups and to eliminate avoidable health inequalities. It should also be underlined that decisions made in sectors outside of public health and health care, such as education, transportation, and criminal justice, strongly affect health and well-being (Pollack et al. [14]).
As mentioned above, the main objective of activities in the area of health policy is to improve the health of the population, and this improvement is nowadays understood in two ways (Łyszczarz [15]). First, in terms of improving the average health status, e.g., measured in terms of life expectancy or premature mortality. Second, in terms of reducing inequalities in health, an issue to which increasing importance is attached. The term ‘health inequality’ refers to differences in the health of individuals or specific subgroups; any measurable aspect of health that varies across individuals or according to socially relevant groupings can be called a health inequality (Boyle [16], Kawachi et al. [17], Arcaya et al. [18]). The aim of actions in the field of health policy is to reduce such inequalities. Large inequalities in health status exist across population groups, countries and specific regions within countries (Wojtyła-Buciora et al. [19], Wojtyła et al. [20]). These health inequalities are linked to many factors, including differential exposure to health risk factors and access to health care (Samet [21]). Inequalities in health are mainly manifested as differences in health status between socioeconomic groups, but they can also be described in terms of employment status, sex or geographic location (Crombie et al. [22]). Reducing inequalities is an increasingly important theme in contemporary research on the implementation of health policy around the world (see, e.g., Spinakis et al. [23]). It should also be emphasized that health inequalities have to be considered a global problem, which affects not only the populations of the poorest countries and regions but also those of the richest ones; persistent health inequalities are among the most serious and challenging health problems worldwide (Barreto [24]). How policies can reduce the main factors of health inequality and promote health equality will be a key challenge for public health in the future.
Monitoring population health and eliminating health inequalities are essential activities aimed at maintaining and improving public health. The main goals of health monitoring involve measuring the extent of health problems, their trends, and the degree of variation between different population groups, including spatial distribution, as well as identifying priority areas for public health (Pizot et al. [25]). Another objective of monitoring is to track the current health situation at the national and local level. This is especially important today, in the era of the COVID-19 pandemic, when up-to-date information is required at lower levels of spatial aggregation.
The unequal spatial distribution of resources, such as clinics, hospitals, nurses, pharmacies, or doctors, could make entire communities more vulnerable and less resilient to adverse health effects. That is why the health situation needs to be investigated by accounting for spatial differences to gain a deeper understanding of why and how some geographical areas experience different health than others (Ozdenerol [26]). Understanding the role played by location in shaping the geographic distribution of the health situation within countries is critical for informing appropriate public health policy regarding prevention and treatment (Casper et al. [27]).
There are numerous articles about the spatial variation in the health situation, health inequalities or health conditions at the local or national level (see, for instance, Gilliland et al. [28], Wang and Nie [29], Chen et al. [30]). In the case of Poland, various analyses have been conducted to investigate regional inequalities in the health status of the population (Wierzbicka [31], Bem et al. [32]).
Interestingly, all studies mentioned above were based on data for a specific year or, in some cases where comparative analysis was involved, for two years (e.g., Shi et al. [33], Hübelová et al. [34]). If the authors of these articles chose, say, p variables describing the health situation of a given territorial unit, then the obtained data were p-dimensional vectors or points in a p-dimensional Euclidean space.
This article presents a more general approach to investigating the health situation across territorial units, which is based on spatio-temporal data. This kind of data is more general than static vector data as it takes into account changes that happen over time. The statistical methodology involving the use of functional discriminant coordinates and cluster analysis is applied to available data for Poland. However, this approach can be used to investigate the health situation or other phenomena at lower levels of spatial aggregation in other countries. For this reason, its results may be useful for policy-makers in the field of public health. The data used to measure the health situation in Poland come from the Local Data Bank (LDB). The analysis takes into account several important variables related to the health situation, observed in the period 2013–2018 at the level of districts (LAU units, also called poviats) located within provinces (also called voivodships, approximately equivalent to the NUTS2 (regions) level; see Section 2). More specifically, each district is described by 8 variables representing the situation over 6 years. The data for each of the 380 districts were arranged in the form of a matrix with 6 rows and 8 columns, which gives a total of 380 × 6 × 8 = 18,240 numerical values.
The main aim of this article is to determine whether Polish provinces are homogeneous in terms of spatio-temporal data characterizing their health situation. In order to answer this question, three multivariate statistical methods were used: multivariate functional discriminant coordinates analysis (MFDCA), functional cluster analysis (FCA), and the multivariate functional coefficient of variation (MFCV).
In the first step, spatio-temporal data were transformed into functional data by applying a continuous function of time $t$ (see, e.g., Górecki and Krzyśko [35]). Functional data can be regarded as realizations of the random process $X(t)$. Then, functional discriminant coordinates were constructed in the functional data space, and further calculations were performed in the functional discriminant coordinate space.
At this point, an important question arises: do functional data recorded as continuous functions really exist, and can these multivariate functions actually be derived? This question is critical because, in practice, values of an observed random process are always recorded at discrete moments in time, sparsely or densely distributed over the interval of observation. Thus, in this case, we encounter a time series or, in other words, a high-dimensional vector of observations. However, there are numerous reasons why it is useful to model a time series as a continuous function (an element of a certain functional space); one of them is that functional data have many advantages in comparison to other representations of time series. In particular, the MFDCA derived in the present study has the following statistical advantages:
  • Firstly, functional data are commonly used to cope with the problem of missing observations, which is inevitable in many areas of applied research. Unfortunately, most methods of data analysis require complete time series. Removing time series with missing observations from a data set is one of the popular solutions, but this can lead, and in most cases does lead, to serious data loss. Another possibility is to use one of the many methods of missing data prediction, but, in that case, the results will depend on the interpolation method. In contrast to these approaches, in the case of functional data, the problem of missing observations is resolved by expressing a given time series as a set of continuous functions.
  • Secondly, in the statistical development of MFDCA, the structure of observations is naturally retained when using functional data, i.e., the temporal link is maintained and the information regarding any measurement is taken into account. Consequently, results are assumed to be robust.
  • Thirdly, moments of observation do not have to be equally spaced in a particular time series, which can be a major advantage in online applications.
  • Fourthly, when using functional data, one avoids the problem of dimensionality. When the total number of time points in which observations are made exceeds the number of time series under analysis, most statistical methods do not provide satisfactory results because of misleading false estimates. In the case of functional data, this problem can be avoided because the time series are replaced by a set of continuous representative functions, which are independent of the time points in which observations are made.
The construction of functional discriminant coordinates is described in Górecki et al. [36], and their application to fruit data can be found in Hanusz et al. [37]. Two other proposals are: kernel discriminant coordinates (Krzyśko et al. [38]) and discriminant coordinates with the additional condition imposed on the covariance matrix (Krzyśko et al. [39]).
In the second step, cluster analysis was used to distinguish between groups of homogeneous provinces. Ward’s hierarchical clustering method was chosen as a commonly used technique in cluster analysis. Moreover, to determine whether obtained clusters are homogeneous, a functional multivariate coefficient of variation was applied.
The main value of this article, according to its authors, is the proposed statistical methodology. Despite the use of country-specific data for the purpose of spatial analysis of the health situation, the presented methods are universal and can be successfully applied to any territorial unit and spatio-temporal dynamic data connected to other phenomena (e.g., poverty or the labor market situation at lower levels of spatial aggregation).
This article is organized as follows. Section 2 contains a short description of data used to analyze differences in the health situation across Polish provinces. The section also provides a description of the administrative division of Poland and details of the procedure of data standardization, as well as their transformation into functional data. Section 3 presents the statistical methodology involving the use of functional discriminant coordinates, cluster analysis, and the functional multivariate coefficient of variation. How this approach was applied to real data describing the health situation in Poland is described in Section 4. Finally, concluding remarks and further steps to be taken in the future are provided in Section 5.

2. The Data

The original data set contains values of p = 8 variables characterizing the health situation of the population (see Table 1). All variables come from the LDB, which is Poland’s largest database of information relating to the economy, society and the environment. Data and statistical indicators in the LDB describe the entire country, as well as units representing three NUTS levels: macroregions (NUTS1), regions (NUTS2), and subregions (NUTS3).
Table 1 also contains information about variable type, with S denoting the so-called stimulant, where a higher value means a better situation (in terms of health), and D denoting the so-called destimulant, where lower values represent a better situation (Walesiak and Dudek [40]).
The variables were selected with a view to obtaining a relatively comprehensive description of the health situation of the population and taking into account their availability and completeness. The data cover the period 2013–2018, i.e., T = 6 years and describe n = 380 districts located within 16 provinces (see Table 2).
Provinces are essentially equivalent to NUTS2 units, while districts are the upper level of local administrative units, which are currently not part of the NUTS system. The NUTS classification (Nomenclature of territorial units for statistics) is a geographical standard used for the statistical division of the economic territories of the EU Member States into three regional levels defined by population size classes. It was established in order to enable the collection, compilation, and dissemination of harmonized regional statistics in the European Union. More information about the administrative division of Poland can be found at https://stat.gov.pl/en/regional-statistics/classification-of-territorial-units/administrative-division-of-poland/. Figure 1 shows the administrative division of Poland into provinces and districts (the left panel) and the division of the opolskie province (as an example) into districts (the right panel).
The values of the selected variables, expressed in different measurement units and having different ranges of variation, were standardized using the method of zero unitization (see, for example, Jajuga and Walesiak [41]).
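As an illustration of this standardization step, the sketch below applies zero unitization in its commonly used min-max form. The article does not state the formula, so the treatment of stimulants and destimulants, the function name `zero_unitize`, and the sample values are assumptions made for illustration only; whether the minimum and maximum are taken over all districts and years jointly is likewise assumed.

```python
import numpy as np

def zero_unitize(x, kind="S"):
    """Rescale one variable to the interval [0, 1] by zero unitization.

    kind="S" (stimulant): higher raw values give higher scores.
    kind="D" (destimulant): lower raw values give higher scores.
    Assumption: min and max are taken over all values passed in.
    """
    x = np.asarray(x, dtype=float)
    lo, hi = x.min(), x.max()
    if hi == lo:
        raise ValueError("variable is constant; its range is zero")
    if kind == "S":
        return (x - lo) / (hi - lo)
    if kind == "D":
        return (hi - x) / (hi - lo)
    raise ValueError("kind must be 'S' or 'D'")

# Hypothetical raw values, not taken from the Local Data Bank
doctors = [12.0, 20.0, 28.0]       # treated here as a stimulant
deaths = [300.0, 450.0, 600.0]     # treated here as a destimulant
z_doctors = zero_unitize(doctors, "S")
z_deaths = zero_unitize(deaths, "D")
```

After this transformation every variable lies in [0, 1] and a larger value always indicates a better situation, which makes variables with different units comparable.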
Subsequently, the unitized data were transformed into functional data using the least squares method (see, e.g., Górecki and Krzyśko [35]).
Now, let us assume that the $d$-th component of the process $Z$ can be represented by a finite number of orthonormal basis functions $\{\varphi_b\}$:
$$Z_d(t) = \sum_{b=0}^{B_d} \alpha_{db}\,\varphi_b(t), \quad t \in I, \quad d = 1, 2, \ldots, p,$$
where $\alpha_{db}$ are random variables such that $\operatorname{Var}(\alpha_{db}) < \infty$ for $d = 1, 2, \ldots, p$ and $b = 0, 1, \ldots, B_d$.
Let
$$\alpha = (\alpha_{10}, \ldots, \alpha_{1B_1}, \ldots, \alpha_{p0}, \ldots, \alpha_{pB_p})^{\top}$$
and
$$\Phi(t) = \begin{pmatrix} \varphi_{B_1}^{\top}(t) & 0 & \cdots & 0 \\ 0 & \varphi_{B_2}^{\top}(t) & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \varphi_{B_p}^{\top}(t) \end{pmatrix},$$
where $\varphi_{B_d} = (\varphi_0, \ldots, \varphi_{B_d})^{\top}$, $d = 1, \ldots, p$, $\alpha \in \mathbb{R}^{K+p}$, $\Phi \in \mathbb{R}^{p \times (K+p)}$, $K = B_1 + \cdots + B_p$. Then,
$$Z(t) = \Phi(t)\,\alpha, \quad t \in I.$$
Individual years (time points) were assigned the following values: $t_1 = 0.5$ (2013), $t_2 = 1.5$ (2014), $\ldots$, $t_6 = 5.5$ (2018). The functions $\varphi$ are considered on the interval $I = [0, T] = [0, 6]$. The Fourier basis of the form
$$\varphi_0(t) = \frac{1}{\sqrt{T}}, \quad \varphi_{2k-1}(t) = \sqrt{\frac{2}{T}}\,\sin\!\left(\frac{2\pi k t}{T}\right), \quad \varphi_{2k}(t) = \sqrt{\frac{2}{T}}\,\cos\!\left(\frac{2\pi k t}{T}\right),$$
where $t \in [0, T]$, $k = 1, 2, \ldots$, was adopted as the orthonormal basis. Górecki and Krzyśko [35] showed that the Fourier basis leads to a minimal number of terms in the expansion of a given function into a series, which is a desirable feature because expansion coefficients play the role of new variables in the functional approach. Given the small number of time points, for each of the 8 variables, the number of expansion terms was the same and equal to 5. Hence, $B_1 = \cdots = B_8 = 4$, $K = B_1 + \cdots + B_8 = 32$, $K + p = 40$. Thus, $\alpha \in \mathbb{R}^{40}$ and $\Phi \in \mathbb{R}^{8 \times 40}$.
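The transformation of one unitized time series into five Fourier coefficients can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper names `fourier_basis` and `fit_coefficients` and the sample series `y` are hypothetical, and ordinary least squares via `numpy.linalg.lstsq` is assumed to correspond to the least squares method mentioned in the text.

```python
import numpy as np

T = 6.0
t_obs = np.arange(6) + 0.5            # 0.5, 1.5, ..., 5.5 for the years 2013-2018

def fourier_basis(t, n_basis=5, T=T):
    """Evaluate the first n_basis orthonormal Fourier functions on [0, T].

    Returns an array of shape (len(t), n_basis) with columns ordered as
    constant, sin (k=1), cos (k=1), sin (k=2), cos (k=2), ...
    """
    t = np.atleast_1d(np.asarray(t, dtype=float))
    cols = [np.full_like(t, 1.0 / np.sqrt(T))]
    k = 1
    while len(cols) < n_basis:
        cols.append(np.sqrt(2.0 / T) * np.sin(2 * np.pi * k * t / T))
        if len(cols) < n_basis:
            cols.append(np.sqrt(2.0 / T) * np.cos(2 * np.pi * k * t / T))
        k += 1
    return np.column_stack(cols)

def fit_coefficients(y, t=t_obs, n_basis=5):
    """Least-squares coefficients of one series in the Fourier basis."""
    design = fourier_basis(t, n_basis)
    coef, *_ = np.linalg.lstsq(design, np.asarray(y, dtype=float), rcond=None)
    return coef

# Hypothetical unitized series for one variable in one district
y = np.array([0.40, 0.42, 0.47, 0.51, 0.50, 0.55])
a = fit_coefficients(y)               # 5 coefficients replace 6 observations
fitted = fourier_basis(t_obs) @ a     # smoothed values at the 6 time points

# Numerical check of orthonormality on [0, T] (midpoint rule on a fine grid)
n_grid = 6000
grid = (np.arange(n_grid) + 0.5) * T / n_grid
G = fourier_basis(grid)
gram = G.T @ G * (T / n_grid)         # expected to be close to the identity
```

Stacking the five coefficients of all eight variables gives one 40-dimensional vector per district, which matches the dimension $K + p = 40$ stated above.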
Figure 2 shows the functional data (average values) for 8 variables and 16 provinces. One can see how the values of individual variables vary over time and between provinces.

3. Statistical Methodology

3.1. Functional Discriminant Coordinates

Our purpose is to construct a discriminant coordinate based on multivariate functional data, i.e., to construct
$$U = \langle u, Z \rangle = \int_I u^{\top}(t)\,Z(t)\,dt$$
such that its between-class variance is maximal compared with the within-class variance, where
$$u(t) = \Phi(t)\,\gamma.$$
The construction of functional discriminant coordinates is described in Górecki et al. [36] and Hanusz et al. [37].
The construction of discriminant coordinates for the random process $Z$ essentially consists in constructing classical discriminant coordinates for a random vector $\alpha$ because the discriminant component $U_k$ has the form $U_k = \gamma_k^{\top}\alpha$, where $\alpha$ is the random vector in the representation $Z(t) = \Phi(t)\alpha$ of the random process $Z$, and $\gamma_k$ is an eigenvector in the generalized eigenproblem $(B - \lambda_k W)\gamma_k = 0$, where $B$ and $W$ are the between-class and within-class matrices, respectively.
Remark 1.
The examination of the elements of the vector weight function for the original processes in each discriminant coordinate (elements of the vectors $u_k$) helps to interpret the principal axes of between-class variation.
At a given time point $t$, the greater the absolute value of a component of the vector weight function, the greater the contribution of the corresponding component process of $Z$ to the structure of the given functional discriminant coordinate. The total contribution of a particular original process $Z_i$ to the structure of a particular functional discriminant coordinate is equal to the area under the absolute value of the weight function corresponding to this process.
In practice, the vector $\alpha$ is unknown and must be estimated based on the sample. Let $z_{i1}, z_{i2}, \ldots, z_{in_i}$ be a sample belonging to the $i$-th class, where $i = 1, 2, \ldots, L$. The function $z_{ij}$ has the form
$$z_{ij}(t) = \Phi(t)\,a_{ij},$$
where $a_{ij} = (a_{10}^{(ij)}, \ldots, a_{1B_1}^{(ij)}, \ldots, a_{p0}^{(ij)}, \ldots, a_{pB_p}^{(ij)})^{\top}$, $i = 1, 2, \ldots, L$, $j = 1, 2, \ldots, n_i$.
Let
$$\bar{a}_i = \frac{1}{n_i}\sum_{j=1}^{n_i} a_{ij}, \quad \bar{a} = \frac{1}{n}\sum_{i=1}^{L} n_i\,\bar{a}_i, \quad i = 1, \ldots, L, \quad n = n_1 + \cdots + n_L.$$
Then,
$$\hat{B} = \frac{1}{L-1}\sum_{i=1}^{L} n_i\,(\bar{a}_i - \bar{a})(\bar{a}_i - \bar{a})^{\top},$$
$$\hat{W} = \frac{1}{n-L}\sum_{i=1}^{L}\sum_{j=1}^{n_i} (a_{ij} - \bar{a}_i)(a_{ij} - \bar{a}_i)^{\top}.$$
Next, we find the non-zero eigenvalues $\hat{\lambda}_1^{*} \geq \hat{\lambda}_2^{*} \geq \cdots \geq \hat{\lambda}_s^{*}$ and the corresponding eigenvectors $\hat{\gamma}_1, \hat{\gamma}_2, \ldots, \hat{\gamma}_s$ of the matrix $\hat{W}^{-1}\hat{B}$, where $s = \min(K + p, L - 1)$. Hence,
$$\hat{u}_k(t) = \Phi(t)\,\hat{\gamma}_k,$$
and the coefficients of the projection of the $j$-th realization $z_{ij}$ of the process $Z$ belonging to the $i$-th class on the $k$-th functional discriminant coordinate are equal to:
$$\hat{U}_{ijk} = \langle \hat{u}_k, z_{ij} \rangle = \hat{\gamma}_k^{\top} a_{ij},$$
for $i = 1, 2, \ldots, L$, $j = 1, 2, \ldots, n_i$, $k = 1, 2, \ldots, s$.
The plots of the pairs ( U ^ i j 1 , U ^ i j 2 ) provide a visual representation of the relative position of groups in the two-dimensional space. Since the configuration obtained is deemed to be optimal in terms of the ability to discriminate between the groups, wide overlaps are to be considered as a sign of no or small differences between the groups involved.
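The estimation steps of this subsection can be sketched in a few lines. The code below is an illustrative sketch rather than the authors' software: the function names `between_within` and `discriminant_coordinates` are hypothetical, the data are simulated, the within-class matrix is assumed to be non-singular, and the eigenvectors are left in the scaling returned by NumPy because the text does not specify a normalization.

```python
import numpy as np

def between_within(A, labels):
    """Between-class (B) and within-class (W) matrices of coefficient rows."""
    A = np.asarray(A, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    n, q = A.shape
    L = len(classes)
    grand_mean = A.mean(axis=0)        # equals (1/n) * sum_i n_i * mean_i
    B = np.zeros((q, q))
    W = np.zeros((q, q))
    for c in classes:
        Ac = A[labels == c]
        mean_c = Ac.mean(axis=0)
        diff = (mean_c - grand_mean)[:, None]
        B += len(Ac) * (diff @ diff.T)
        centered = Ac - mean_c
        W += centered.T @ centered
    return B / (L - 1), W / (n - L)

def discriminant_coordinates(A, labels):
    """Leading eigenvalues and eigenvectors of W^{-1} B, largest first."""
    B, W = between_within(A, labels)
    q = B.shape[0]
    L = len(np.unique(labels))
    evals, evecs = np.linalg.eig(np.linalg.solve(W, B))
    evals, evecs = evals.real, evecs.real   # eigenvalues are real in theory
    keep = np.argsort(evals)[::-1][: min(q, L - 1)]
    return evals[keep], evecs[:, keep]

# Simulated coefficient vectors: 3 classes, 10 units each, 3 coefficients
rng = np.random.default_rng(0)
labels = np.repeat([0, 1, 2], 10)
shifts = np.array([[0.0, 0.0, 0.0], [2.0, 0.0, 1.0], [0.0, 3.0, -1.0]])
A = rng.normal(size=(30, 3)) + shifts[labels]

B, W = between_within(A, labels)
evals, gammas = discriminant_coordinates(A, labels)
scores = A @ gammas                    # U_ijk = gamma_k' a_ij, one row per unit
```

In the setting of the article, `A` would hold the 380 district coefficient vectors of length 40 and `labels` the 16 provinces, giving at most 15 discriminant coordinates.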

3.2. Cluster Analysis

Provinces that are homogeneous in terms of the considered variables were identified using cluster analysis. More precisely, we applied Ward’s hierarchical clustering method (see, for example, Seber [42], Chapter 7; Mirkin [43]; Krzyśko et al. [44], Chapter 12). The clustering procedure is based on the Mahalanobis distance between the provinces.
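Let $\hat{U}_{ij} = (\hat{U}_{ij1}, \ldots, \hat{U}_{ijs})^{\top}$. This distance is defined by the following formula:
$$d_{ij}^{2} = (\bar{U}_i - \bar{U}_j)^{\top} S^{-1} (\bar{U}_i - \bar{U}_j),$$
where
$$\bar{U}_i = \frac{1}{n_i}\sum_{j=1}^{n_i} \hat{U}_{ij}, \quad i = 1, 2, \ldots, L,$$
and
$$S = \frac{1}{n-L}\sum_{i=1}^{L}\sum_{j=1}^{n_i} (\hat{U}_{ij} - \bar{U}_i)(\hat{U}_{ij} - \bar{U}_i)^{\top}, \quad n = n_1 + \cdots + n_L.$$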
The Mahalanobis distance takes into account not only the difference between the mean vectors of two provinces; this difference is also weighted by the variances and covariances of the examined variables, estimated from all provinces (in this way, the dispersion of districts around their province means is taken into account).
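The distance defined above can be computed directly from the discriminant scores. The following sketch assumes the scores are stored row-wise in an array; the function name `mahalanobis_between_classes` and the toy data are hypothetical and serve only to illustrate the formula.

```python
import numpy as np

def mahalanobis_between_classes(U, labels):
    """Squared Mahalanobis distances between class mean vectors.

    U      : (n, s) array of scores, one row per unit (district)
    labels : length-n array of class (province) labels
    Returns the sorted class labels and a matrix D2 with
    D2[i, j] = (m_i - m_j)' S^{-1} (m_i - m_j), where S is the pooled
    within-class covariance matrix with divisor n - L.
    """
    U = np.asarray(U, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    n, s = U.shape
    L = len(classes)
    means = np.vstack([U[labels == c].mean(axis=0) for c in classes])
    S = np.zeros((s, s))
    for c, m in zip(classes, means):
        centered = U[labels == c] - m
        S += centered.T @ centered
    S /= n - L
    S_inv = np.linalg.inv(S)
    diffs = means[:, None, :] - means[None, :, :]
    D2 = np.einsum("ijk,kl,ijl->ij", diffs, S_inv, diffs)
    return classes, D2

# Toy example: two classes with identical spread, means (0, 0) and (3, 4)
base = np.array([[-1.0, 0.0], [1.0, 0.0], [0.0, -1.0], [0.0, 1.0]])
U = np.vstack([base, base + np.array([3.0, 4.0])])
labels = np.array(["a"] * 4 + ["b"] * 4)
classes, D2 = mahalanobis_between_classes(U, labels)
```

In this toy example the pooled covariance is (2/3) times the identity, so the squared distance between the two means is 25 / (2/3) = 37.5.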

3.3. Functional Multivariate Coefficient of Variation

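Let $Z = (Z_1, \ldots, Z_p)^{\top}$ be a $p$-dimensional random process with mean function $\mu = (\mu_1, \ldots, \mu_p)^{\top} \neq 0$. We assume that the process $Z$ belongs to the Hilbert space $L_2^p(I)$ of $p$-dimensional vectors of square integrable functions on $I$.
The functional multivariate coefficient of variation (MFCV) for the random process $Z$ is defined as follows (Krzyśko and Smaga [45]):
$$\mathrm{MFCV} = \frac{\sqrt{\operatorname{Var}(\langle \mu^{*}, Z \rangle)}}{\lVert \mu \rVert},$$
where $\mu^{*}(t) = \mu(t)/\lVert \mu \rVert$, $t \in I$.
If the process $Z$ has the basis representation $Z(t) = \Phi(t)\alpha$, then
$$\mathrm{MFCV} = \sqrt{\frac{a^{\top} J_{\Phi}\,\Sigma_{\alpha}\,J_{\Phi}\,a}{(a^{\top} J_{\Phi}\,a)^{2}}},$$
where $a$ is the vector of basis coefficients of the mean function, i.e., $\mu(t) = \Phi(t)a$, $\Sigma_{\alpha} = \operatorname{Cov}(\alpha)$, $J_{\Phi} = \operatorname{diag}(J_{\phi_1}, \ldots, J_{\phi_p})$, and $J_{\phi_k} = \int_I \phi_k(t)\,\phi_k^{\top}(t)\,dt$ is the $B_k \times B_k$ cross product matrix corresponding to the basis $\{\phi_{kl}\}_{l=1}^{\infty}$, $k = 1, \ldots, p$. For an orthonormal basis, for instance the Fourier basis, the cross product matrix is equal to the identity matrix. Then (Albert and Zhang [46]),
$$\mathrm{MFCV} = \sqrt{\frac{a^{\top}\Sigma_{\alpha}\,a}{(a^{\top}a)^{2}}}.$$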
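For an orthonormal basis, the last formula can be evaluated from a matrix of coefficient vectors. The sketch below is an illustration under assumptions the article does not state: the mean vector and the covariance matrix are replaced by their usual sample estimates (the latter with divisor n − 1), and the function name `mfcv_orthonormal` and the input data are hypothetical.

```python
import numpy as np

def mfcv_orthonormal(A):
    """MFCV for an orthonormal basis, estimated from coefficient rows.

    A : (n, q) array, one row of basis coefficients per unit.
    Uses the sample mean for a and the sample covariance for Sigma.
    The sample mean vector must be non-zero.
    """
    A = np.asarray(A, dtype=float)
    a = A.mean(axis=0)
    Sigma = np.cov(A, rowvar=False)
    # sqrt(a' Sigma a / (a'a)^2) = sqrt(a' Sigma a) / (a'a)
    return float(np.sqrt(a @ Sigma @ a) / (a @ a))

# Hypothetical coefficients: only the first coordinate varies
A = np.array([[1.0, 0.0], [3.0, 0.0]])
cv = mfcv_orthonormal(A)
```

When only one coordinate is non-zero, as in this example, the measure reduces to the ordinary coefficient of variation (standard deviation divided by the mean), here √2 / 2.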

4. Results

To construct functional discriminant coordinates, we calculated the estimates $a_i$ of the vectors $\alpha_i$, $i = 1, \ldots, L$.
The vectors $a_i$ were then used to construct the estimator $\hat{B}$ of the matrix of between-class variability and the estimator $\hat{W}$ of the matrix of within-class variability. Next, the non-zero eigenvalues $\hat{\lambda}_k^{*}$ of the matrix $\hat{W}^{-1}\hat{B}$ and the corresponding eigenvectors $\hat{\gamma}_k$, $k = 1, \ldots, 15$, were calculated.
Multivariate functional discriminant coordinates have the form:
$$\hat{U}_k = \langle \hat{u}_k, Z \rangle,$$
where
$$\hat{u}_k(t) = \Phi(t)\,\hat{\gamma}_k, \quad k = 1, \ldots, s, \quad s = \min(K + p, L - 1) = \min(40, 15) = 15,$$
are vectors of weight functions.
We treat the resulting multivariate functional discriminant coordinates as indicators (synthetic measures) of the health situation of inhabitants of Polish provinces. These indicators contain full information on the values of 8 diagnostic variables measured over 6 years. They are, therefore, composite indicators of the health situation.
These 15 composite indicators differ in their power to differentiate between the provinces, since they have different variances (eigenvalues); see Table 3.
The first indicator is the strongest and the fifteenth is the least powerful. It is not possible to see the mutual position of provinces in the 15-dimensional space of these indicators, but it is possible in the space of the first two composite indicators that differentiate between the provinces most clearly.
It can be seen that 44.1% of the total variability is attributed to the first two multivariate functional discriminant coordinates.
The mean values of the 16 provinces in the system of the first two functional discriminant coordinates are presented in Table 4.
The location of the 16 provinces in the system of the first two functional discriminant coordinates is shown in Figure 3.
The total contribution of the individual variables to the structure of the particular functional discriminant coordinates can be estimated using the area under the absolute value of the weight functions corresponding to a given variable. The graphs of the eight components of the vector weight function for the first and second functional discriminant coordinates are shown in Figure 4.
These contributions of the 8 variables to the first and second functional discriminant coordinates are also given in Table 5. Table 5 shows that the largest shares in the construction of the first functional discriminant coordinate belong to variable No. 2 (Doctors per 10,000 population), with 32.0%, and variable No. 7 (Number of doctor consultations per 10,000 population), with 14.7%. On the other hand, variable No. 4 (Deaths of people due to cardiovascular disease per 100,000 population), with 21.0%, and variable No. 2 (Doctors per 10,000 population), with 20.2%, have the greatest shares in the construction of the second functional discriminant coordinate. Values of the coefficients of the vector weight functions are also presented in Table 5.
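The percentage shares discussed here follow from the areas under the absolute values of the weight functions. A minimal sketch of that calculation is given below; it assumes each weight function has been evaluated on a grid over $I = [0, 6]$ and uses the trapezoid rule, which is a choice made for illustration. The function name `contribution_shares` and the example weight functions are hypothetical and are not the values from Table 5.

```python
import numpy as np

def contribution_shares(weights, t):
    """Percentage share of each variable in one discriminant coordinate.

    weights : (p, m) array; row d holds the d-th component of the vector
              weight function evaluated on the grid t
    t       : (m,) increasing grid covering the interval I
    """
    abs_w = np.abs(np.asarray(weights, dtype=float))
    t = np.asarray(t, dtype=float)
    dt = np.diff(t)
    # trapezoid rule applied to |u_d(t)| for every variable d
    areas = ((abs_w[:, 1:] + abs_w[:, :-1]) / 2.0 * dt).sum(axis=1)
    return 100.0 * areas / areas.sum()

# Hypothetical weight functions for two variables on [0, 6]
t = np.linspace(0.0, 6.0, 601)
weights = np.vstack([np.full_like(t, 1.0), np.full_like(t, -3.0)])
shares = contribution_shares(weights, t)
```

Because the absolute value is taken before integrating, a weight function that is negative throughout contributes as much as a positive one of the same magnitude.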
In the next step, cluster analysis was used to select groups of homogeneous provinces in a fifteen-dimensional space of functional discriminant coordinates. The Ward method was selected as a commonly used technique. The Mahalanobis distance was chosen as a measure of the distance between the mean vectors of individual provinces. The obtained dendrogram is presented in Figure 5.
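One way to combine Ward's method with the Mahalanobis distance is sketched below. It rests on an assumption the article does not spell out: the province mean vectors are first whitened with the Cholesky factor of the pooled covariance matrix $S$, so that Euclidean distances between whitened vectors equal Mahalanobis distances, and SciPy's Ward linkage is then applied. The function name `ward_clusters` and the input data are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def ward_clusters(means, S, n_clusters):
    """Ward clustering of class mean vectors under the Mahalanobis metric.

    means      : (L, s) array of class (province) mean vectors
    S          : (s, s) pooled within-class covariance matrix
    n_clusters : number of clusters to cut the dendrogram into
    """
    C = np.linalg.cholesky(np.asarray(S, dtype=float))   # S = C C'
    # Whitening: Euclidean distance after this step equals Mahalanobis distance
    whitened = np.linalg.solve(C, np.asarray(means, dtype=float).T).T
    Z = linkage(whitened, method="ward")
    return fcluster(Z, t=n_clusters, criterion="maxclust")

# Hypothetical mean vectors forming two well-separated groups
means = np.array([[0.0, 0.0], [0.2, 0.1], [0.1, -0.1],
                  [5.0, 5.0], [5.1, 4.9], [4.9, 5.2]])
S = np.array([[1.0, 0.3], [0.3, 2.0]])
cluster_ids = ward_clusters(means, S, n_clusters=2)
```

With 16 province mean vectors in the 15-dimensional space and `n_clusters=4`, the same call would return one cluster label per province; the linkage matrix `Z` is what a dendrogram is drawn from.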
We obtained four homogeneous clusters. The assignment of individual provinces to clusters is shown in Table 6 and in Figure 6 (spatial distribution).
Taking into account the spatial variation in the health situation of the provinces, it is possible to distinguish four spatial clusters. The first one, denoted as II, consists of six provinces located in the north-western part of Poland. At the opposite end of the country, there are four provinces that make up cluster III. Finally, in the approximate middle belt, one can see cluster I, consisting of three provinces. Cluster IV consists of two non-contiguous provinces (podlaskie and opolskie), located in different parts of Poland.
Figure 6 shows that Poland can be divided into two parts: Western Poland (provinces belonging to clusters I and II) and Eastern Poland (provinces belonging to clusters III and IV). The health of inhabitants living in the provinces belonging to cluster I is the best, while that of people living in the provinces belonging to cluster IV is the worst. Previous studies conducted in Poland show that Western Poland is better developed in socio-economic terms than Eastern Poland (see, for instance, Szymkowiak et al. [47], Marchetti et al. [48], Roszka [49]). The present study shows that this division is also valid as regards the health situation.
Decision-makers at national and local government levels should be advised to redirect more funds to improve the health situation of inhabitants of Eastern Poland.
To verify that the obtained four clusters are really homogeneous, a multivariate functional coefficient of variation (MFCV) was calculated for all provinces together and for each cluster separately (see Table 7).
As can be seen, the coefficients of variation for individual clusters are lower than that for all provinces combined, which confirms that the clusters are indeed homogeneous.

5. Conclusions

The above statistical analysis provides evidence for the conclusion that the provinces are not homogeneous in terms of the selected variables characterizing the health situation of their inhabitants. The analysis consisted of multiple steps. Values of the selected variables, which are expressed in different measurement units and have different ranges of variation, were standardized using the method of zero unitization. Then, the unitized data were transformed into functional data in order to enable the construction of discriminant coordinates in the functional data space. The multivariate functional discriminant coordinates were treated as composite indicators (synthetic measures) of the health situation of inhabitants of Polish provinces. These indicators contain full information on the values of 8 diagnostic variables measured over 6 years.
In the next step, cluster analysis was applied to select groups of homogeneous provinces in the space of functional discriminant coordinates using the Ward method. The Mahalanobis distance was chosen as a measure of distance between the mean vectors of individual provinces. The homogeneity of the resulting four clusters was analyzed using a multivariate functional coefficient of variation, which was calculated for all provinces together and for each cluster separately. It turned out that the coefficients of variation for individual clusters are smaller than the corresponding value for combined provinces, which confirms that the clusters are indeed homogeneous.
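The agglomeration step can be illustrated with a small, naive implementation of Ward's criterion. This is a sketch under stated assumptions rather than the authors' code: it uses squared Euclidean distance between cluster centroids, which stands in for the Mahalanobis distance only if the coordinates have already been scaled accordingly, and the function names and toy points are hypothetical.

```python
from itertools import combinations


def centroid(points):
    """Mean vector of a list of equal-length points."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def ward_cost(a, b):
    """Increase in within-cluster sum of squares caused by merging a and b."""
    ca, cb = centroid(a), centroid(b)
    sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * sq_dist


def ward_clusters(points, k):
    """Merge the cheapest pair of clusters until only k clusters remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: ward_cost(
                [points[m] for m in clusters[ij[0]]],
                [points[m] for m in clusters[ij[1]]],
            ),
        )
        clusters[i] = clusters[i] + clusters[j]  # i < j, so index j stays valid
        del clusters[j]
    return sorted(sorted(c) for c in clusters)


# Toy data: two well-separated pairs of points (not the province means).
toy = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
print(ward_clusters(toy, 2))
```

To apply the sketch to the study, `points` would hold the mean vectors of the 16 provinces in the space of functional discriminant coordinates and `k` would be 4.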
The obtained clusters reflect changes in the situation of the provinces over the six analyzed years. In previous studies, data for each year were analyzed separately using classical statistical methods. However, one must not forget that these are spatio-temporal data that change over time.
The authors realize that the choice of diagnostic variables may be a weakness of this study. These particular diagnostic variables were selected with a view to obtaining a relatively comprehensive description of the health situation of the population, given their availability and completeness. Therefore, the selection should be treated mainly as an illustration of the proposed statistical methodology for processing spatio-temporal data.
The statistical methods used for multivariate functional data were suggested earlier by the authors of this paper.

Author Contributions

Conceptualization, M.K.; Methodology, M.K. and W.W.; Software, W.W. and M.S.; Validation, A.W.; Formal analysis, M.K., W.W., and M.S.; Investigation, M.K., W.W., M.S., and A.W.; Data curation, M.S.; Writing—original draft preparation, M.K., W.W., M.S., and A.W.; Writing—review and editing, M.K., W.W., M.S., and A.W.; Visualization, W.W. and M.S.; Funding acquisition, A.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are openly available in the Local Data Bank, which is Poland’s largest database of information relating to the economy, society and the environment.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Ngamaba, K.H.; Panagioti, M.; Armitage, C.J. How strongly related are health status and subjective well-being? Systematic review and meta-analysis. Eur. J. Public Health 2010, 27, 879–885.
  2. James, C.; Devaux, M.; Sassi, F. Inclusive Growth and Health; OECD Health Working Papers, No. 103; OECD Publishing: Paris, France, 2017.
  3. OECD. Health for Everyone? Social Inequalities in Health and Health Systems; OECD Health Policy Studies; OECD Publishing: Paris, France, 2019.
  4. Penchansky, R.; Thomas, J.W. The concept of access: Definition and relationship to consumer satisfaction. Med. Care 1981, 19, 127–140.
  5. Wang, F. Measurement, optimization, and impact of health care accessibility: A methodological review. Ann. Assoc. Am. Geogr. 2012, 102, 1104–1112.
  6. Neutens, T. Accessibility, equity and health care: Review and research directions for transport geographers. J. Transp. Geogr. 2015, 43, 14–27.
  7. Barbarisi, I.; Bruno, G.; Diglio, A.; Elizalde, J.; Piccolo, C. A spatial analysis to evaluate the impact of deregulation policies in the pharmacy sector: Evidence from the case of Navarre. Health Policy 2019, 123, 1108–1115.
  8. Bruno, G.; Cavola, M.; Diglio, A.; Piccolo, C. Improving spatial accessibility to regional health systems through facility capacity management. Socio-Econ. Plan. Sci. 2020, 71, 100881.
  9. Lu, C.; Zhang, Z.; Lan, X. Impact of China’s referral reform on the equity and spatial accessibility of healthcare resources: A case study of Beijing. Soc. Sci. Med. 2019, 235, 112386.
  10. Okuyama, K.; Akai, K.; Kijima, T.; Abe, T.; Isomura, M.; Nabika, T. Effect of geographic accessibility to primary care on treatment status of hypertension. PLoS ONE 2019, 14, e0213098.
  11. Pu, Q.; Yoo, E.H.; Rothstein, D.H.; Cairo, S.; Malemo, L. Improving the spatial accessibility of healthcare in North Kivu, Democratic Republic of Congo. Appl. Geogr. 2020, 121, 102262.
  12. De Leeuw, E.; Clavier, C.; Breton, E. Health policy–why research it and how: Health political science. Health Res. Policy Syst. 2014, 12, 55.
  13. Campos, P.A.; Reich, M.R. Political analysis for health policy implementation. Health Syst. Reform 2019, 5, 224–235.
  14. Pollack Porter, K.M.; Rutkow, L.; McGinty, E.E. The importance of policy change for addressing public health problems. Public Health Rep. 2018, 133, 9S–14S.
  15. Łyszczarz, B. Dynamika regionalnych nierówności w zdrowiu w Polsce (Dynamics of Regional Health Inequalities in Poland). Nierówności Społeczne Wzrost Gospod. 2014, 38, 191–200.
  16. Boyle, P. Global public health—Challenges and leadership. J. Health Inequalities 2018, 4, 55–61.
  17. Kawachi, I.; Subramanian, S.V.; Almeida-Filho, N. A glossary for health inequalities. J. Epidemiol. Community Health 2002, 56, 647–652.
  18. Arcaya, M.C.; Arcaya, A.L.; Subramanian, S.V. Inequalities in health: Definitions, concepts, and theories. Glob. Health Action 2015, 8, 1–12.
  19. Wojtyła-Buciora, P.; Stawińska-Witoszyńska, B.; Wojtyła, K.; Klimberg, A.; Wojtyła, C.; Wojtyła, A.; Samolczyk-Wanyura, D.; Marcinkowski, J.T. Assessing physical activity and sedentary lifestyle behaviours for children and adolescents living in a district of Poland. What are the key determinants for improving health? Ann. Agric. Environ. Med. 2014, 21, 606–612.
  20. Wojtyła, C.; Biliński, P.; Paprzycki, P.; Warzocha, K. Haematological parameters in postpartum women and their babies in Poland—Comparison of urban and rural areas. Ann. Agric. Environ. Med. 2011, 18, 380–385.
  21. Samet, J.M. The environment and health inequalities: Problems and solutions. J. Health Inequalities 2019, 5, 21–27.
  22. Crombie, I.K.; Irvine, L.; Elliott, L.; Wallace, H. Closing the Health Inequalities Gap: An International Perspective; No. EUR/05/5048925; WHO Regional Office for Europe: Copenhagen, Denmark, 2005.
  23. Spinakis, A.; Anastasiou, G.; Panousis, V.; Spiliopoulos, K.; Palaiologou, S.; Yfantopoulos, J. Expert Review and Proposals for Measurement of Health Inequalities in the European Union—Full Report; European Commission Directorate General for Health and Consumers: Luxembourg, 2011.
  24. Barreto, M.L. Health inequalities: A global perspective. Ciência Saúde Coletiva 2017, 22, 2097–2108.
  25. Pizot, C.; Dragomir, M.; Macacu, A.; Koechlin, A.; Bota, M.; Boyle, P. Global burden of pancreas cancer: Regional disparities in incidence, mortality and survival. J. Health Inequalities 2019, 5, 96–112.
  26. Ozdenerol, E. Spatial Health Inequalities: Adapting GIS Tools and Data Analysis; CRC Press: Boca Raton, FL, USA, 2016.
  27. Casper, M.; Kramer, M.R.; Peacock, J.M.; Vaughan, A.S. Population Health, Place, and Space: Spatial Perspectives in Chronic Disease Research and Practice. Prev. Chronic Dis. 2019, 16, E123.
  28. Gilliland, J.A.; Shah, T.I.; Clark, A.; Sibbald, S.; Seabrook, J.A. A geospatial approach to understanding inequalities in accessibility to primary care among vulnerable populations. PLoS ONE 2019, 14, e0210113.
  29. Wang, Z.; Nie, K. Measuring Spatial Patterns of Health Care Facilities and Their Relationships with Hypertension Inpatients in a Network-Constrained Urban System. Int. J. Environ. Res. Public Health 2019, 16, 3204.
  30. Chen, J.; Bai, Y.; Zhang, P.; Qiu, J.; Hu, Y.; Wang, T.; Xu, C.; Gong, P. A Spatial Distribution Equilibrium Evaluation of Health Service Resources at Community Grid Scale in Yichang, China. Sustainability 2020, 12, 52.
  31. Wierzbicka, A. Taxonomic analysis of the Polish public health in comparison with selected European countries. Stat. Transit. New Ser. 2012, 13, 343–364.
  32. Bem, A.; Ucieklak-Jeż, P.; Siedlecki, R. The spatial differentiation of the availability of health care in Polish regions. Procedia-Soc. Behav. Sci. 2016, 220, 12–20.
  33. Shi, L.; Starfield, B.; Kennedy, B.; Kawachi, I. Income inequality, primary care, and health indicators. J. Fam. Pract. 1999, 48, 275–284.
  34. Hübelová, D.; Kozumpliková, A.; Jadaczková, V.; Rousová, G. Spatial differentiation of selected health factors of the South Moravian Region population. Geogr. Cassoviensis 2018, XII, 33–52.
  35. Górecki, T.; Krzyśko, M. Functional Principal Components Analysis. In Data Analysis Methods and Its Applications; Pociecha, J., Decker, R., Eds.; C.H. Beck: Munich, Germany, 2012; pp. 71–87.
  36. Górecki, T.; Krzyśko, M.; Waszak, Ł.; Wołyński, W. Selected statistical methods of data for multivariate functional data. Stat. Pap. 2018, 59, 153–182.
  37. Hanusz, Z.; Krzyśko, M.; Nadulski, R.; Waszak, Ł. Discriminant coordinates analysis for multivariate functional data. Commun. Stat. Theory Methods 2020, 49, 4506–4519.
  38. Krzyśko, M.; Wołyński, W.; Ratajczak, W.; Kierczyńska, A.; Wenerska, B. Sustainable development of Polish macroregions-study by means of the kernel discriminant coordinates method. Int. J. Environ. Res. Public Health 2020, 17, 7021.
  39. Krzyśko, M.; Łukaszonek, W.; Wołyński, W. Discriminant coordinates analysis in the case of multivariate repeated measures data. Stat. Transit. New Ser. 2018, 19, 495–506.
  40. Walesiak, M.; Dudek, A. The Choice of Variable Normalization Method in Cluster Analysis. In Education Excellence and Innovation Management: A 2025 Vision to Sustain Economic Development During Global Challenges; Soliman, K.S., Ed.; International Business Information Management Association: Seville, Spain, 2020; pp. 325–340.
  41. Jajuga, K.; Walesiak, M. Standardisation of data set under different measurement scales. In Classification and Information Processing at the Turn of the Millennium; Springer: Berlin/Heidelberg, Germany, 2000; pp. 105–112.
  42. Seber, G.A.F. Multiple Observations; John Wiley and Sons, Inc.: New York, NY, USA, 2004.
  43. Mirkin, B. Clustering: A Data Recovery Approach, 2nd ed.; Taylor and Francis Group, LLC.: London, UK, 2013.
  44. Krzyśko, M.; Wołyński, W.; Górecki, T.; Skorzybut, M. Learning Systems: Pattern Recognition, Cluster Analysis and Dimensional Reduction; Scientific and Technical Publishers: Warsaw, Poland, 2008.
  45. Krzyśko, M.; Smaga, Ł. A multivariate coefficient of variation for functional data. Stat. Its Interface 2019, 12, 647–658.
  46. Albert, A.; Zhang, L. A novel definition of the multivariate coefficient of variation. Biom. J. 2010, 52, 667–675.
  47. Szymkowiak, M.; Młodak, A.; Wawrowski, Ł. Mapping poverty at the level of subregions in Poland using indirect estimation. Stat. Transit. New Ser. 2017, 18, 609–635.
  48. Marchetti, S.; Beręsewicz, M.; Salvati, N.; Szymkowiak, M.; Wawrowski, Ł. The use of a three-level M-quantile model to map poverty at local administrative unit 1 in Poland. J. R. Stat. Soc. Ser. A 2018, 181, 1077–1104.
  49. Roszka, W. Spatial microsimulation of personal income in Poland at the level of subregions. Stat. Transit. New Ser. 2019, 20, 133–153.
Figure 1. Administrative division of Poland—provinces and districts.
Figure 2. Average values of 8 variables calculated from functional data for districts included in each of the 16 provinces. Note: The ordinate axis shows the unitized values of a given variable.
Figure 3. Plotted values of the first two functional discriminant coordinates.
Figure 4. Weight functions for the first (left) and the second (right) functional discriminant coordinate.
Figure 5. Dendrogram for 16 Polish provinces (the Ward method).
Figure 6. Spatial variation in the health situation across the provinces of Poland.
Table 1. List of variables used in analysis.
Variable | Description | Type of Variable
1 | Nurses and midwives per 10,000 population | S
2 | Doctors per 10,000 population | S
3 | Population per generally available pharmacy | D
4 | Deaths of people due to cardiovascular disease per 100,000 population | D
5 | Total deaths due to cancer per 100,000 population | D
6 | Health out-patient departments per 10,000 population | S
7 | Number of doctors consultations per 10,000 population | S
8 | Infant deaths per 1000 live births | D
Table 2. The composition of Polish provinces.
Number | Province Name | Number of Districts
1 | dolnośląskie | 30
2 | kujawsko-pomorskie | 23
3 | lubelskie | 24
4 | lubuskie | 14
5 | łódzkie | 24
6 | małopolskie | 22
7 | mazowieckie | 42
8 | opolskie | 12
9 | podkarpackie | 25
10 | podlaskie | 17
11 | pomorskie | 20
12 | śląskie | 36
13 | świętokrzyskie | 14
14 | warmińsko-mazurskie | 21
15 | wielkopolskie | 35
16 | zachodniopomorskie | 21
Total | | 380
Table 3. Eigenvalues and related statistics.
Number | Eigenvalue | % Total Variance | % Cumulative Variance
1 | 48.2682 | 27.6745 | 27.6745
2 | 28.6024 | 16.3991 | 44.0735
3 | 26.3461 | 15.1055 | 59.1790
4 | 15.1611 | 8.6926 | 67.8716
5 | 11.6214 | 6.6631 | 74.5347
6 | 9.4012 | 5.3901 | 79.9248
7 | 8.5436 | 4.8984 | 84.8232
8 | 5.6155 | 3.2197 | 88.0429
9 | 4.9859 | 2.8587 | 90.9016
10 | 4.8982 | 2.8083 | 93.7099
11 | 3.4169 | 1.9591 | 95.6690
12 | 2.7111 | 1.5544 | 97.2234
13 | 2.0385 | 1.1687 | 98.3921
14 | 1.4576 | 0.8357 | 99.2279
15 | 1.3467 | 0.7721 | 100.0000
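The percentage columns of Table 3 follow from the eigenvalues alone: each eigenvalue is divided by the sum of all 15 eigenvalues, and the shares are then accumulated. The following is a minimal sketch in Python with illustrative variable names; because the published eigenvalues are rounded to four decimals, the recomputed shares may differ from the table in the last digit.

```python
from itertools import accumulate

# Eigenvalues copied from Table 3.
eigenvalues = [
    48.2682, 28.6024, 26.3461, 15.1611, 11.6214,
    9.4012, 8.5436, 5.6155, 4.9859, 4.8982,
    3.4169, 2.7111, 2.0385, 1.4576, 1.3467,
]

total = sum(eigenvalues)

# Share of total variability attributed to each discriminant coordinate (%).
pct = [100 * value / total for value in eigenvalues]

# Running total of the shares (%).
cumulative = list(accumulate(pct))

for number, (p, c) in enumerate(zip(pct, cumulative), start=1):
    print(f"{number:2d}  {p:8.4f}  {c:9.4f}")
```

Read this way, the first two functional discriminant coordinates together account for roughly 44% of the total, which is the share represented when only two coordinates are plotted.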
Table 4. The mean values of the 16 provinces (first two).
Number | Variable 1 | Variable 2
1 | −1.0418 | −0.1272
2 | −0.9538 | −1.8010
3 | 2.1856 | 0.8277
4 | −0.2247 | −0.0520
5 | −0.3005 | 0.4795
6 | 1.0408 | 0.2084
7 | 0.4584 | −1.0042
8 | 0.9223 | 1.5262
9 | 2.5934 | −1.5963
10 | 0.5307 | 2.2876
11 | −1.0424 | 0.8810
12 | −0.7883 | 0.7896
13 | 3.0622 | 0.1767
14 | −1.6026 | 0.6644
15 | −1.5847 | −1.2004
16 | −0.9763 | 0.6872
Table 5. Values of coefficients of the vector weight functions.
First functional discriminant coordinate
Variable | $\hat{\gamma}_{10}$ | $\hat{\gamma}_{11}$ | $\hat{\gamma}_{12}$ | $\hat{\gamma}_{13}$ | $\hat{\gamma}_{14}$ | Area | Area (%)
1 | 2.5407 | −3.2389 | 2.0978 | −0.3165 | −0.9303 | 9.0039 | 8.2381
2 | 0.5792 | 7.8019 | −0.7947 | −7.3340 | 12.8083 | 34.9609 | 31.9876
3 | 0.2994 | −0.5582 | 2.4586 | 6.3065 | 0.1062 | 14.4837 | 13.2519
4 | −3.0143 | 0.3007 | 6.7926 | −1.9114 | −1.2470 | 15.1264 | 13.8399
5 | 3.8320 | 0.7027 | 0.1091 | −0.9202 | −0.7656 | 9.3864 | 8.5882
6 | −0.1795 | −3.5460 | −0.4691 | 1.3195 | −1.3298 | 8.7501 | 8.0059
7 | 0.9525 | 0.6116 | −6.6553 | 4.7278 | 1.7190 | 16.0264 | 14.6634
8 | −0.1802 | 0.6553 | 0.0063 | −0.1974 | 0.3950 | 1.5574 | 1.4249
Second functional discriminant coordinate
Variable | $\hat{\gamma}_{20}$ | $\hat{\gamma}_{21}$ | $\hat{\gamma}_{22}$ | $\hat{\gamma}_{23}$ | $\hat{\gamma}_{24}$ | Area | Area (%)
1 | −0.8113 | 2.1021 | 1.9010 | 7.7922 | 6.7172 | 23.1732 | 17.3879
2 | 1.3009 | −9.6903 | −5.7988 | 6.8425 | 3.5944 | 26.8581 | 20.1529
3 | 0.2159 | 5.0384 | 0.8509 | −7.6787 | 0.3243 | 18.8570 | 14.1493
4 | −1.4582 | 10.6698 | 0.1764 | −1.8355 | 8.9551 | 28.0178 | 21.0231
5 | −0.0163 | −2.8675 | −0.7596 | −0.2515 | −0.4281 | 6.5687 | 4.9288
6 | 0.7540 | 2.9261 | −1.3371 | 0.7459 | −2.1031 | 7.4906 | 5.6206
7 | 1.2655 | −0.1701 | 4.4019 | −4.0271 | −7.7531 | 20.7457 | 15.5665
8 | −0.5597 | −0.0570 | −0.2735 | −0.0484 | 0.3602 | 1.5604 | 1.1709
Table 6. Membership of provinces in the four clusters.
Number | Province | Cluster
1 | dolnośląskie | I
2 | kujawsko-pomorskie | II
3 | lubelskie | III
4 | lubuskie | II
5 | łódzkie | I
6 | małopolskie | III
7 | mazowieckie | I
8 | opolskie | IV
9 | podkarpackie | III
10 | podlaskie | IV
11 | pomorskie | II
12 | śląskie | II
13 | świętokrzyskie | III
14 | warmińsko-mazurskie | II
15 | wielkopolskie | II
16 | zachodniopomorskie | II
Table 7. Values of the multivariate functional coefficient of variation (MFCV).
Provinces | MFCV
All | 0.3705
Cluster I | 0.2975
Cluster II | 0.2890
Cluster III | 0.1728
Cluster IV | 0.2047
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Krzyśko, M.; Wołyńki, W.; Szymkowiak, M.; Wojtyła, A. A Spatio-Temporal Analysis of the Health Situation in Poland Based on Functional Discriminant Coordinates. Int. J. Environ. Res. Public Health 2021, 18, 1109. https://doi.org/10.3390/ijerph18031109
