Hypertension is a chronic medical condition in which the blood pressure in the arteries is elevated. It is classified into two categories: primary hypertension and secondary hypertension. Between 90% and 95% of cases are categorized as primary hypertension, which denotes high blood pressure with no obvious underlying medical cause [1]. The World Health Organization has identified hypertension as the leading cause of cardiovascular and cerebrovascular mortality and the world’s most common chronic disease, as hypertension is a major risk factor for stroke, myocardial infarction, heart failure and arterial disease. The treatment of hypertension and its complications consumes substantial medical and social resources. The American Heart Association projected that the total costs of high blood pressure would be $91.4 billion in 2015 [2]. There are many risk factors for hypertension, including age, race, family history, being overweight or obese, physical inactivity, tobacco use, a high-salt diet, too little vitamin D and potassium in the diet, excessive alcohol use, stress and certain chronic conditions [3]. In addition, the risk of hypertension can vary among regions depending on their environmental conditions and socioeconomic position [5
In China, the prevalence of hypertension has increased continuously over the past fifty years. According to the 2010 Chinese guidelines for the management of hypertension, from 1991 to 2002, the awareness of hypertension increased from 26.3% to 30.2%, the treatment rate rose from 12.1% to 24.7% and the control rate grew from 2.8% to 6.1% [7]. However, these rates are relatively low compared with those of developed countries. The guidelines indicate that over 130 million people with hypertension are unaware of their condition and that at least 30 million people are aware of their hypertension but do not receive any medical treatment. Indeed, over 75% of people who are aware that they have hypertension do not adequately control it [7
Recent hypertension studies in China mainly focus on lifestyle modifications, prevention, the impact on health-related quality of life and medical treatment [8]. Few epidemiological studies have attempted to describe the spatial variation of hypertension over a large area. Previous works have indicated that neighborhood walkability, food availability, safety, and social cohesion may be mechanisms that link neighborhoods to hypertension [12] and that the number of patients admitted for hypertension largely depends on the general hypertensive population. In this study, we attempted to explore the spatial variation of hospital admissions for hypertension throughout Shenzhen in 2011. The classic statistic in epidemiology is the standardized ratio (SR), which is commonly used to represent disease risk across a geographical area [13] in order to identify regions with higher or lower disease risk and is useful for capturing regional changes. For each region, the standardized ratio is expressed as the ratio of the number of observed cases to the number of expected cases, as estimated by the national disease rate, with or without adjustment for socioeconomic and demographic variables [15
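As a minimal sketch of the ratio just described, the following Python computes an unadjusted standardized ratio per region. The function name, the example counts and the reference rate of 10 admissions per 100,000 residents are illustrative assumptions and are not data from this study.

```python
def standardized_ratios(observed, population, rate_per_100k):
    """Observed/expected ratio per region, without covariate adjustment."""
    ratios = []
    for obs, pop in zip(observed, population):
        # Expected cases if the region followed the reference rate.
        expected = pop * rate_per_100k / 100_000
        ratios.append(obs / expected)
    return ratios


# Hypothetical regions: observed cases and resident populations.
observed = [12, 3, 0]
population = [100_000, 20_000, 5_000]
sr = standardized_ratios(observed, population, rate_per_100k=10)
```

A ratio above 1 indicates more cases than expected. The third hypothetical region has an expected count of only 0.5, so a single missing case already drives its ratio to 0, which illustrates how small populations can produce extreme values.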
Although the standardized ratio is a useful tool in disease mapping research, it has some problems. An inevitable problem is spatial autocorrelation, an idea often attributed to the geographer Waldo Tobler. Measuring the spatial pattern of feature values is based on the notion that things that are near to each other are more alike than things that are far apart. In geographical research, the study area is often delineated by artificial boundaries for measurement or administrative purposes [16]. However, a spatial process in an area interacts with neighbors outside these boundaries, and adjacent areas usually have similar attributes. In addition, the standardized ratio is a deficient estimator because it depends greatly on the population of each area. Usually, sparsely populated areas with few (or zero) cases can generate extreme values [13]. Because the administrative divisions depend on population size, sparsely populated areas are often larger than densely populated areas; furthermore, they tend to dominate the map visually even though they produce the least precise risk estimates [13]. Moreover, shortcomings in the census data can produce defective risk estimates. For example, rapid population growth since the previous census would cause overestimated risks in a study area [17
To tackle the spatial dependence and inaccurate estimation of the standardized ratios, many methods have been employed to describe and assess the amount of true spatial variation of disease risk [15]. A disease outbreak can be considered a geographic process that is highly correlated with a specific geographic location and the corresponding conditions. In GIScience, we analyze geographic processes for two reasons: first, we seek to predict the likelihood that something will occur in a place [18]; second, we wish to identify the underlying factors [21]. The relationships between various attributes of the spatial data can be defined as a model, which can become quite complex and time-consuming. For example, a Bayesian estimation approach was used to analyze small-area diabetes prevalence in the US [23]. For this study, we employ a model-based relative risk estimation method based on hierarchical Bayesian models to assess the true spatial heterogeneity of the relative risk for hypertension admissions. These models are widely used for risk smoothing in disease mapping and have been described in detail in previous works [24]. The basic principle of Bayesian methods is that uncertain data can be strengthened by combining them with prior information [14]. Such estimates are a compromise between the local value of the standardized ratio and either the mean value for the map as a whole or some local mean [13]. The distribution for the spatial components in these models is discussed in [27]. With covariate information and spatial components, models based on Bayesian statistics provide a more accurate estimation of the relative risk of each sub-district. In addition, methods of spatial statistics and analysis were applied in this study to identify and map spatial patterns.
Another important topic is the analysis scale, which is often known as the modifiable areal unit problem [28]. The analysis scale includes the size of the units in which phenomena are measured and the size of the units into which measurements are aggregated for data analysis and mapping [30]. To study a phenomenon accurately, it has been suggested that the analysis scale must match the actual scale of the phenomenon [31]. However, this matching can be quite difficult, especially in unfamiliar cases. Traditionally, geographers analyze phenomena in geographical units that are as small as possible [14], which makes data collection difficult and expensive. Furthermore, the choice of the analysis scale is often dictated by the availability of data, and because of sparse data, there will often be a tradeoff between homogeneity within small geographic units and the precision of risk estimates [13]. Because of the availability of census data, the study is performed at the sub-district level, even though smaller geographical units exist in the study area.
In this work, we explored the spatial heterogeneity of the relative risk for hypertension admissions throughout Shenzhen in 2011 and attempted to address the drawbacks of the standardized ratio in disease mapping. Spatial statistical techniques and methods based on hierarchical Bayesian models were utilized in this study, and both covariate information and random components were employed in these models. After smoothing the relative risk of hypertension, a stable standardized ratio was obtained for each sub-district to highlight those sub-districts that have elevated or reduced relative risk. Our study aimed to identify specific regions with high relative risk for hypertension admissions; this information is useful for Shenzhen’s health administrators in improving the quality of hospital-based services for hypertension patients.
4. Discussion and Conclusions
In recent years, researchers have applied statistical techniques and spatial analysis to study the spatial variation of hospital admissions for hypertension. For example, local Moran’s I analysis and geographically weighted regression were used to investigate the patterns in standardized ratios of cardiovascular disease [48]; the risk factors for hypertension were examined using Kaplan-Meier methods and Cox proportional hazards models [49]; and spatial scan statistics were used to detect clusters of high or low prevalence of overweight people or people with hypertension in rural South Africa [50]. In this study, spatial scan statistics from SaTScan 9.1 were utilized to detect clusters of the relative risk of hospital admissions for hypertension in Shenzhen, spatial statistical techniques from ArcGIS 10.0 were adopted to identify patterns in the standardized ratio, and methods based on Bayesian statistics were used to smooth the relative risk in small-area disease mapping.
Moran’s I index compares the value for each feature in the pair to the mean value for the dataset rather than directly comparing the attribute values of neighboring features to each other. If the average difference between neighboring features is less than the average difference between all features, the values of the features are clustered. The results of Moran’s I index demonstrated that the spatial autocorrelation was positive in this geographical context and that the spatial dependence of nearby observed cases should be included in the modeling of estimates of the relative risk for hypertension admissions. Then, the Getis-Ord General G-statistic was used in this study to measure the concentration of values. One of the disadvantages of the Getis-Ord General G-statistic is that the results are highly dependent on the size of the features being analyzed. When large areas tend to have low values and smaller areas tend to have high values, the G-statistic will indicate that the high values are concentrated even if the concentrations of highs and lows are equally distributed. Because the study area is often divided based on population size and is delineated by administrative boundaries, this tendency is especially significant when studying geographical phenomena and will lead to bias in analyzing and mapping the high-value clustering.
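The comparison to the dataset mean can be made concrete with a short sketch of the global Moran’s I computation. This is the standard textbook formula, not the ArcGIS implementation used in the study; the function name, the four-region chain of neighbors and the attribute values are hypothetical.

```python
def morans_i(values, weights):
    """Global Moran's I for n values and an n x n spatial weight matrix."""
    n = len(values)
    mean = sum(values) / n
    # Deviation of every feature from the dataset mean.
    dev = [v - mean for v in values]
    # Sum of all spatial weights.
    s0 = sum(sum(row) for row in weights)
    # Weighted cross-products of deviations for each pair of features.
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    return (n / s0) * cross / sum(d * d for d in dev)


# Hypothetical layout: four regions in a row, each adjacent to the next.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
trend = morans_i([1, 2, 3, 4], chain)        # similar values are adjacent
alternating = morans_i([1, 4, 1, 4], chain)  # dissimilar values are adjacent
```

In this toy layout the smoothly increasing values give a positive index and the alternating values give a negative one, which matches the interpretation of positive spatial autocorrelation as clustering of similar values.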
In hot spot analysis, we applied the Gi* statistic because it includes the value of the target feature, which affects the occurrence of the clusters. Apart from the hot spots and cold spots, the Gi* values of the remaining areas were not statistically significant, which means that there was no apparent concentration of either high or low standardized ratios surrounding these areas. This usually happens either when the surrounding standardized ratios are near the mean or when the target sub-district is surrounded by a combination of high and low standardized ratios. The local statistic works best for identifying high-value clusters when there is no measurable pattern of clustering or dispersion across the study area [51
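The following sketch shows how a Gi* z-score can be computed from the standard Getis-Ord formulation, in which each feature’s neighborhood includes the feature itself. It is an illustration under assumed inputs (a hypothetical row of five regions with binary weights), not the ArcGIS routine used in this study.

```python
import math


def gi_star(values, weights):
    """Getis-Ord Gi* z-score per feature; weights[i][i] must be non-zero."""
    n = len(values)
    mean = sum(values) / n
    # Global standard deviation of the attribute.
    s = math.sqrt(sum(v * v for v in values) / n - mean ** 2)
    scores = []
    for i in range(n):
        w = weights[i]
        local_sum = sum(w[j] * values[j] for j in range(n))
        w_sum = sum(w)
        w_sq = sum(x * x for x in w)
        # Local sum expected if values were spread evenly over the map.
        expected = mean * w_sum
        spread = s * math.sqrt((n * w_sq - w_sum ** 2) / (n - 1))
        scores.append((local_sum - expected) / spread)
    return scores


# Hypothetical row of five regions; each neighborhood is the region
# itself plus its immediate neighbors.
weights = [
    [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1],
]
z = gi_star([10, 9, 1, 1, 1], weights)
```

Large positive scores mark candidate hot spots and large negative scores mark candidate cold spots. Because both the expected local sum and the spread term depend on the weights of each feature, the score is scaled by the number of neighbors that the feature has.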
Then, the spatial scan statistics were applied to detect clusters of the relative risk. The geometry of the area being scanned, the probability distribution generating events under the null hypothesis, and the shapes and sizes of the scanning window are the three basic properties of the scan statistic [52]. Probability approximations and Monte Carlo-based hypothesis testing are applied in the models of the spatial scan statistics, whereas the local G-statistic uses a neighborhood based on either adjacent features or a set distance. According to the results of a purely spatial scan analysis, the primary and secondary clusters were statistically significant. However, Zhang noted that spatial scan statistics and the local statistic can neither directly incorporate ecological covariates nor account for overdispersion [53
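To illustrate how a scanning window is scored, the sketch below evaluates the Poisson log-likelihood ratio commonly used in spatial scan statistics for high-rate clusters. It assumes that the expected counts are scaled to sum to the total number of cases and that each candidate window is a proper subset of the map; the window list and all counts are hypothetical, and the Monte Carlo significance test is omitted. It is not the SaTScan implementation.

```python
import math


def poisson_llr(cases_in, expected_in, total_cases):
    """Log-likelihood ratio of one window under the Poisson scan model."""
    cases_out = total_cases - cases_in
    expected_out = total_cases - expected_in
    # Only windows with a higher rate inside than outside are scored.
    if cases_in / expected_in <= cases_out / expected_out:
        return 0.0
    llr = cases_in * math.log(cases_in / expected_in)
    if cases_out > 0:
        llr += cases_out * math.log(cases_out / expected_out)
    return llr


def best_window(cases, expected, windows):
    """Return (llr, window) for the highest-scoring candidate window."""
    total = sum(cases)
    scored = []
    for window in windows:
        c = sum(cases[i] for i in window)
        e = sum(expected[i] for i in window)
        scored.append((poisson_llr(c, e, total), window))
    return max(scored, key=lambda item: item[0])


# Hypothetical data: five regions with equal expected counts.
cases = [30, 25, 5, 5, 5]
expected = [14.0, 14.0, 14.0, 14.0, 14.0]
windows = [(0,), (0, 1), (1, 2), (2, 3), (3, 4)]
top_llr, top_window = best_window(cases, expected, windows)
```

In a full analysis, the highest score would be compared with the scores obtained from many data sets simulated under the null hypothesis in order to obtain a p-value for the primary cluster.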
To tackle the spatial dependence and overdispersion of the standardized ratios, hierarchical Bayesian models were applied in this study. In these models, the standardized ratio was smoothed locally towards the mean ratio in the set of adjacent sub-districts. In small-area disease mapping, estimates of the relative risk are often inaccurate because the population is usually small in the analysis unit. From Table 5, it is clear that the smoothing was greater for the least-stable estimates, where the expected number of cases was small. Further research should be conducted in these areas because the larger areas tend to dominate the map visually, even though they produce the least-precise risk estimates.
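Why the smoothing is greater where the expected count is small can be seen in a deliberately simplified stand-in for the hierarchical models: a Poisson likelihood with a single gamma prior on the relative risk. Unlike the models in this study, this sketch shrinks towards one global prior mean rather than towards the mean of adjacent sub-districts, and the prior parameters and counts are assumed values chosen for illustration only.

```python
def smoothed_ratio(observed, expected, prior_shape, prior_rate):
    """Posterior mean relative risk for Poisson counts with a gamma prior.

    The result is a weighted average of the raw ratio observed/expected
    and the prior mean prior_shape/prior_rate, with weight
    expected / (expected + prior_rate) on the raw ratio.
    """
    return (prior_shape + observed) / (prior_rate + expected)


# Two hypothetical sub-districts with the same raw ratio of 3.0
# and a prior mean of 1.0 (shape 5, rate 5).
small = smoothed_ratio(observed=3, expected=1.0, prior_shape=5.0, prior_rate=5.0)
large = smoothed_ratio(observed=300, expected=100.0, prior_shape=5.0, prior_rate=5.0)
```

Both hypothetical areas start from the same raw ratio, but the estimate for the area with one expected case is pulled most of the way towards the prior mean, whereas the estimate for the area with 100 expected cases barely moves.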
Although multilevel spatial grids were obtainable, the study was performed at the sub-district level because the available census data were aggregated at this level. However, a disadvantage is that the analysis scales used in many geographical studies are arbitrary and modifiable. For example, the census data may be aggregated into sub-districts, postcode areas, police precincts or any other spatial partition, and this choice affects the results of the analysis. The pattern created by a set of features and attributes may change depending on the scale. Because of the availability of data and the restrictions of research funding, we specified the sub-district spatial grid as the analysis scale in our study, as this scale was capable of describing the spatial variation of the relative risk of hospital admissions for hypertension in Shenzhen. The high-value cluster pattern was statistically significant in the observed case counts and in the relative risk in the neighboring sub-districts, which indicated that the spatial autocorrelation was positive and that the spatial dependence should be included in modeling the relative risk. One of the major contributions of this study was highlighting those sub-districts where the relative risk of hospital admissions for hypertension was concentrated; the improvement of public health services should be addressed in these areas. Further work on data collection should target smaller geographical units.
A summary of the top ten sub-districts with significant smoothing; the rank is specified from high to low.
|Sub-district||SR||Smoothing SR||Rank of Expected Cases||Rank of Area|
Another disadvantage of our method lies in the boundaries of the sub-districts, which reflect administrative needs rather than the actual spatial distribution of epidemiological factors [14]. As a result, these boundaries can lead to an inaccurate interpretation of the spatial variation of the relative risk across the study area. Furthermore, although the study area is delineated by these artificial boundaries, the real process continues beyond the area because it interacts with the neighbors outside these borders. Because the calculations are usually based on the spatial neighborhood around each feature, certain spatial statistical techniques may require data on variables that refer to spatial units beyond the boundary of the study area. If these boundary data are not available, this shortcoming represents a form of data incompleteness [16] unless there are fixed natural barriers that would minimize any influence from the surrounding features, as in the case of an island where the coastal boundary affects the spread of some diseases. How the boundary is handled and how spatial neighborhoods and weights are defined are active topics in spatial analysis and statistics. Some solutions to this problem have been proposed in previous works [16]. However, regardless of how the boundary is defined, the features near the edge of the study area will still have fewer neighbors than the features in the center of the study area. In our study, we concluded that the boundary’s effect on the Gi* statistic did not lead to underestimation because each hot spot or cold spot was identified by comparing the local sum to the expected local sum, and, furthermore, the differences in the number of neighbors do not affect the result.
We attempted to identify the factors that are associated with the spatial distribution of hospital admissions for hypertension. Previous works have demonstrated the relationship between the prevalence of hypertension and socioeconomic measures, environmental variables and neighborhood characteristics [5]. In [56], the researchers investigated the association of aircraft noise with the risk of chronic diseases in the general population and concluded that high levels of aircraft noise were associated with an increased risk of stroke, coronary heart disease and cardiovascular disease. In [57], the objective was to investigate whether exposure to aircraft noise increases the risk of hospitalization for cardiovascular diseases in older people residing near airports, and the results showed a statistically significant association between exposure to aircraft noise and the risk of hospitalization for cardiovascular diseases among older people living near airports. In our study, the results revealed that road density, an indirect factor, can be applied in modeling the spatial variations of the relative risk. In [58], the results indicated that there were ethnic differences in clinical trials and in routine care for South Asian diabetes patients. Thus, detailed information is necessary in a geographic correlation study, which is usually conducted at a more local or small-area scale and therefore demands large amounts of data. To acquire the traffic noise level in each neighborhood as a direct factor, noise-dispersion models and manual noise assessments could be used [55
The results of the hierarchical Bayesian model showed that the relative risk of hospital admissions for hypertension was not homogeneous throughout Shenzhen. The high-value clustering was significant in the south and southeast of Shenzhen, a finding that can serve as a guideline for the establishment of hospital-based health services. However, there was an obvious underestimation in this study because of the lack of awareness of hypertension. Furthermore, Shenzhen is not yet facing a serious aging problem compared with other large Chinese cities, and hence, Shenzhen has a relatively low prevalence of hypertension and a relatively low admission rate. In addition, the census data suffer from drawbacks that result from the policy of “Hukou” (residents who hold a formal household registration in Shenzhen), and the population of Shenzhen is unique because the majority of its residents are migrant workers. Because there is a strong connection between this population group and their hometowns and families, they tend to support the elderly, who may suffer from hypertension, which increases the number of hypertension admission cases. The main objective of this study was to improve the estimates of the spatial variation of the relative risk and identify a hot spot for public health services. Further work is necessary to amend the model and explain the spatial heterogeneity of the relative risk that is explored in this study.