Mapping the Burden of Hypertension in South Africa: A Comparative Analysis of the National 2012 SANHANES and the 2016 Demographic and Health Survey

This study investigates the provincial variation in hypertension prevalence in South Africa in 2012 and 2016, adjusting for individual-level demographic, behavioural and socio-economic variables while allowing for spatial autocorrelation and the hierarchical structure of the data. Data were analysed from participants aged ≥15 years from the South African National Health and Nutrition Examination Survey (SANHANES) 2012 and the South African Demographic and Health Survey (DHS) 2016. Hypertension was defined as blood pressure ≥ 140/90 mmHg, a self-reported diagnosis by a health professional, or current use of antihypertensive medication. Bayesian geo-additive regression modelling was used to investigate the association of various socio-economic factors with the prevalence of hypertension across South Africa's nine provinces while controlling for the latent effects of geographical location. Hypertension prevalence was 38.4% in the SANHANES in 2012 and 48.2% in the DHS in 2016. The risk of hypertension was significantly high in KwaZulu-Natal and Mpumalanga in the 2016 DHS, despite being nonsignificant in the SANHANES 2012. In both survey years, hypertension was significantly higher among males, the coloured population group, urban participants and those with self-reported high blood cholesterol. The odds of hypertension increased non-linearly with age, body mass index (BMI) and waist circumference. The findings can inform decision making regarding the allocation of public resources to the most affected areas of the population.


Introduction
Cardiovascular diseases (CVDs) present a major public health concern, contributing significantly to the global disease burden in both high-income countries and low- and middle-income countries (LMICs). Many sub-Saharan African (SSA) countries have experienced dramatic increases in the burden of CVD and its associated risk factors over a short period. The SSA region is demographically and socio-economically diverse, with various populations undergoing a rapid epidemiological and demographic transition [1][2][3]. SSA countries face multiple burdens of disease, with the focus shifting from infectious and communicable diseases, including HIV/AIDS, tuberculosis and malaria, towards the economic and health burden associated with non-communicable diseases (NCDs), mainly CVDs [1,2]. One modifiable risk factor for CVD is hypertension, which is a major source of premature mortality globally [4]. In South Africa, hypertension is one of the most common NCDs and is responsible for premature adult mortality [5].
The South African National Health and Nutrition Examination Survey (SANHANES), conducted in 2012, found that almost a third of South Africans aged 15 or older were hypertensive [6]. The 2016 Demographic and Health Survey indicated a hypertension prevalence of 46.0% in females and 44.0% in males [7]. Many LMICs have reported a prevalence higher than the global average, attributed to treatment non-compliance, urbanization, and behavioural risk factors such as poor diet, physical inactivity, and alcohol and tobacco use [8]. While South Africa remains one of the most unequal societies, the country has experienced rapid urbanization and income growth, resulting in lifestyle, stress and dietary changes among all South Africans. Hypertension prevalence is now relatively high and has increased among all population groups [6,7]. Spatial variation in hypertension and its associated risk factors was examined in a previous study [9]. However, that study was based on data more than two decades old. In many high-income countries, analyses that included geographic variation have indicated substantial heterogeneity in disease prevalence and associated risk factors [10,11]. Mapping both hotspot (high) and cold spot (low) areas for disease prevalence can be used to identify hypertension clustering, its variation by population group, and contextual characteristics that include ethnic or racial residential segregation, poverty, healthcare accessibility, and other socio-economic factors [12,13]. Identifying disease prevalence hotspots is pivotal to strengthening community-based disease management and prevention efforts, so that resource allocation and the planning, delivery and policy of health services are informed [14,15].
Therefore, the aim of this study is to investigate the geographic variation in the prevalence of hypertension in South Africa at a province level in both 2012 and in 2016, adjusting for individual level demographic, behavioural and socio-economic variables. The analyses allow for spatial autocorrelation of the data while adjusting simultaneously for the hierarchical structure and various risk factors.

Study Population
Data were analyzed from participants aged 15 years or older from the South African National Health and Nutrition Examination Survey (SANHANES) of 2012 and the recent South African Demographic and Health Survey (DHS) of 2016. Both are national biobehavioral surveys of the non-institutionalized population of South Africa, that assess the health status of the population using interviews, medical examinations and biomarker analysis. Both surveys employed a multi-stage disproportionate stratified cluster sampling design.
In the 2012 SANHANES, 1000 census enumeration areas (EAs) from the 2001 population census were selected from a database of 86,000 EAs. The EAs were stratified by province and locality type, while race was used as an additional stratification variable in the formal urban areas. A total of 500 EAs representative of the socio-demographic profile of South Africa were selected and 20 households were randomly selected from each EA, yielding an overall sample of 10,000 households. All individuals residing in households were eligible to participate. A total of 8166 of the 10,000 households were valid, yielding 27,580 individuals of all ages who were eligible to be interviewed, of which 25,532 (92.6%) completed the interview. Of those interviewed, 12,025 (43.6%) volunteered to undergo a medical examination of which 7148 were ≥15 years. Details of SANHANES-1 methodology are reported elsewhere [6].
The sampling frame used for the 2016 DHS was the Statistics South Africa Master Sample Frame that was created using the Census 2011 enumeration areas (EAs). EAs of manageable size were considered as primary sampling units (PSUs), whereas small neighbouring EAs were pooled together to form new PSUs. Each of the nine provinces was stratified into urban, farm, and traditional areas, yielding 26 sampling strata. Overall 750 PSUs were selected from the sampling strata and 20 residential dwelling units were selected from each PSU. All individuals residing in the dwelling units were eligible to participate. A total of 13,288 of the 15,292 selected households were occupied, of which 11,083 (83%) individuals were interviewed and 82% of females and 77% of males aged ≥15 years had their blood pressure measured.

Outcome Measure
The primary outcome measure was hypertension, defined as blood pressure ≥ 140/90 mmHg, a self-reported diagnosis by a health professional, or current use of antihypertensive medication. The blood pressure threshold of 140/90 mmHg is based on the 2014 South African Hypertension Practice (SAHP) guidelines [16], which are currently applied in the South African health system. Self-reported diagnosis by a health professional was measured by participants' responses to whether a doctor, nurse, or other health professional had told them that they have high blood pressure. In both the SANHANES and DHS, three systolic and diastolic blood pressure measurements were taken from consenting individuals aged ≥ 15 years using Omron digital blood pressure monitors. Measurements were taken at intervals of 3 min or more in the DHS and at intervals of 5-10 min in the SANHANES. The mean of the second and third measurements was used in the analyses.
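The outcome definition above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function and parameter names are hypothetical, and it assumes that "≥ 140/90 mmHg" means systolic ≥ 140 or diastolic ≥ 90, which the text does not state explicitly.

```python
from statistics import mean

SYSTOLIC_CUTOFF = 140   # mmHg, threshold quoted in the text
DIASTOLIC_CUTOFF = 90   # mmHg, threshold quoted in the text


def is_hypertensive(systolic, diastolic, diagnosed=False, on_medication=False):
    """Classify one participant from three readings taken in order.

    systolic, diastolic: sequences of three readings (mmHg).
    diagnosed, on_medication: self-reported flags (hypothetical names).
    """
    # Use the mean of the second and third readings, as described in the text.
    mean_systolic = mean(systolic[1:3])
    mean_diastolic = mean(diastolic[1:3])
    # Assumption: either component at or above its cutoff counts as elevated.
    elevated = mean_systolic >= SYSTOLIC_CUTOFF or mean_diastolic >= DIASTOLIC_CUTOFF
    return bool(elevated or diagnosed or on_medication)


# Hypothetical participant: the high first reading is ignored by design.
print(is_hypertensive([150, 138, 136], [95, 85, 84]))
```

Discarding the first reading means that a single elevated initial measurement does not, on its own, classify a participant as hypertensive.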

Covariates
The main independent variable was the participants' geographic province of residence. Other covariates were individual-level variables known to be associated with hypertension, namely socio-demographics, health behaviours, and cardiovascular comorbidities. Socio-demographic covariates were sex, age, ethnicity (black/African, 'coloured' (referring to mixed race ancestry), white and Asian/Indian), the respondent's education level (no education vs. primary, secondary and higher education), and wealth index (categorised into 5 quintiles). The ethnicity and racial categories used in this paper were constructed for South Africans during apartheid and prior to the current democratic era, but are still used in national statistical reporting on population groups. The authors do not necessarily subscribe to their continued use. Anthropometric measures including height, weight, and waist circumference were measured. Body mass index (BMI) in kg/m² was categorized as: <25: Normal, 25-29.9: Overweight, and ≥30: Obese. Health behaviours were obtained by self-report of current smoking and current alcohol use. Cardiovascular comorbidities comprised self-reported history of diabetes, high blood cholesterol, coronary heart disease (heart attack or angina), and stroke. Diabetes and high blood cholesterol were based on whether a health professional had informed the participant that they had these conditions. The environmental factor was participants' rural or urban area of residence.
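The BMI categorisation described above could be coded as follows. This is an illustrative sketch with hypothetical function names; it assumes that values between 29.9 and 30 fall in the overweight band, a boundary case the text leaves open.

```python
def bmi(weight_kg, height_m):
    """Body mass index in kg/m^2."""
    return weight_kg / height_m ** 2


def bmi_category(value):
    """Map a BMI value to the three categories used in the text."""
    if value < 25:
        return "Normal"
    if value < 30:  # assumption: 29.9 to 30 is treated as overweight
        return "Overweight"
    return "Obese"


# Hypothetical participant: 70 kg, 1.75 m.
print(bmi_category(bmi(70, 1.75)))
```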

Statistical Analysis
Separate analyses were performed on each national survey dataset to investigate the association of various socio-economic factors with the prevalence of hypertension across the nine provinces of South Africa (Figure 1) while controlling for the latent effects of geographical location. The multistage cluster sampling techniques employed in these surveys introduce a structured component [17], so it is no longer valid to assume that the data are independent: they are inherently hierarchical and often spatially autocorrelated. Thus, there is a need for statistical approaches that explicitly allow for a spatially autocorrelated response while simultaneously accounting for linear and non-linear covariate effects, other potential sources of random error in the data, and the data hierarchy.
Specifically, for our purposes, we utilize Bayesian geo-additive regression modelling such that the outcome variable $y$, defined on a binary scale, indicates whether an individual is hypertensive or not. Thus, $y$ is a realization of a Bernoulli distribution with the probability mass function (PMF) $f(y; p) = p^{y}(1 - p)^{1-y}$ for $y \in \{0, 1\}$, where $p$ is the probability of success, that is, the probability that a randomly selected individual is hypertensive. Then, $y$ is allowed to depend on a set of key covariates through a geo-additive predictor $\eta$ linked to a function of its mean with an appropriate link function $h(\mu)$ as given in Equation (1), that is,
$$\eta = \text{Gender} + \text{Ethnicity} + \cdots + \text{angina} + f(\text{Age}) + f(\text{BMI}) + f_{spat}(\text{spat}) \qquad (1)$$
where $f(\cdot)$ are non-parametric (smooth) functions for the non-linear effects of the continuous covariates such as Age, BMI, and Waist circumference. In addition, $f_{spat}(\cdot)$ represents the spatial random effect, which accounts for potential spatial autocorrelation. For ease of exposition, henceforth, we shall refer to the model specified in Equation (1)
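To make the pieces of this model concrete, the sketch below evaluates the Bernoulli PMF and maps an additive predictor to a probability. It is an illustration only: the text says "an appropriate link function" without naming it, so the logit link used here is an assumption (chosen because the results are reported as odds ratios), and all function names and numeric values are hypothetical.

```python
import math


def bernoulli_pmf(y, p):
    """f(y; p) = p^y (1 - p)^(1 - y) for y in {0, 1}."""
    if y not in (0, 1):
        raise ValueError("y must be 0 or 1")
    return p ** y * (1 - p) ** (1 - y)


def inverse_logit(eta):
    """Map a predictor eta to a probability (assumes a logit link)."""
    return 1.0 / (1.0 + math.exp(-eta))


def geoadditive_predictor(linear_terms, smooth_terms, spatial_effect):
    """Sum the linear effects, smooth-function values and spatial effect."""
    return sum(linear_terms) + sum(smooth_terms) + spatial_effect


# Hypothetical contributions for one individual (not estimates from the paper).
eta = geoadditive_predictor(
    linear_terms=[0.2, -0.1],   # e.g. categorical covariate effects
    smooth_terms=[0.3],         # e.g. f(Age) evaluated at this person's age
    spatial_effect=0.1,         # f_spat for this person's province
)
p = inverse_logit(eta)
print(bernoulli_pmf(1, p), bernoulli_pmf(0, p))
```

The two printed values sum to one, since they are the probabilities of the only two possible outcomes.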

Descriptive Analyses
Characteristics of the two study populations are presented in Table 1, disaggregated by hypertensive status. The prevalence of hypertension was 38.4% in the SANHANES in 2012 and 48.2% in the DHS in 2016. In both survey years, hypertensive individuals were, on average, significantly older, more likely to be white or coloured, more likely to be living in urban areas, to be obese, and to report previous history of diabetes, high blood cholesterol, coronary heart disease and stroke (Table 1).

Geographic Variation in Hypertension
Crude estimates of prevalence of hypertension across the nine provinces in South Africa show geographical variations in prevalence for both datasets (Figure 2). In 2016 (DHS), the highest prevalence of hypertension was found in the coastal provinces of KwaZulu-Natal and the Eastern Cape, followed by Free State and Mpumalanga. Western Cape and Limpopo had the lowest prevalence in 2016 and this represents a dramatic decrease in prevalence for the Western Cape since 2012 (Figure 2b). However, prevalence remained high in KwaZulu-Natal for both surveys, whilst Limpopo remained consistently low in both surveys.

Table 2 presents estimates of the posterior odds ratio (POR), along with the corresponding measure of uncertainty (the 95% credible interval), from the various models fitted to the datasets, together with the corresponding deviance information criterion (DIC) for model selection. Models with smaller DIC were retained as providing the better fit. The POR maps of the nine provinces, in conjunction with the corresponding significance maps, indicate that the risk of hypertension is significantly high in KwaZulu-Natal and Mpumalanga based on the 2016 DHS data, despite being nonsignificant in 2012 (Figures 5 and 6).
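As an illustration of how a POR, its 95% credible interval and a significance label relate to one another, the sketch below summarises posterior draws of a log-odds effect. It is not the authors' procedure: it assumes the effect is on the log-odds scale, uses the posterior median as the point estimate, and calls an effect significant when the interval excludes 1. The function name and the simulated draws are hypothetical.

```python
import math
import random
from statistics import median, quantiles


def summarise_por(log_odds_draws):
    """Summarise posterior draws of a log-odds effect as a POR with a 95% interval."""
    odds_ratios = [math.exp(d) for d in log_odds_draws]
    # n=40 yields 39 cut points at 2.5%, 5%, ..., 97.5%.
    cuts = quantiles(odds_ratios, n=40)
    lower, upper = cuts[0], cuts[-1]
    if lower > 1:
        label = "significantly high"
    elif upper < 1:
        label = "significantly low"
    else:
        label = "not significant"
    return median(odds_ratios), (lower, upper), label


# Simulated draws for one province (hypothetical, not values from the paper).
rng = random.Random(0)
draws = [rng.gauss(0.5, 0.1) for _ in range(2000)]
por, interval, label = summarise_por(draws)
print(round(por, 2), label)
```

The three labels correspond to the three shades used in significance maps of this kind: significantly high, significantly low, and not statistically significant.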

Region of residence (province): see maps (Figure 5; Figure 6).
Figure 5 notes: the central white patch (Lesotho) is excluded from the map. In Figure 5a,b, dark blue to yellow corresponds to low-risk to high-risk provinces. In Figure 5c,d, black corresponds to significantly high-risk regions, white corresponds to significantly low-risk regions, and grey corresponds to regions where the risks are not statistically significant.
Map colour key: blue to red corresponds to low-risk to high-risk provinces.
Table 2 shows that males are significantly more likely to be hypertensive than females, as affirmed across the two datasets. The risk of hypertension was positively associated with age, such that the older one gets, the higher the chances of being hypertensive (Figures 5 and 6). Estimates based on the 2016 DHS show that South Africans who identify as coloured are significantly more likely (about 1.67 times) to be hypertensive than those who identify as Black/African. While the odds of being hypertensive were higher among less educated individuals, these differences were not statistically significant. People who lived in urban areas were more likely to be hypertensive than those who lived in rural areas. Additionally, the risk of hypertension increased with BMI and waist circumference (Figures 5 and 6), as well as with having high blood cholesterol.

Discussion
South Africa has the largest measured hypertension prevalence in Southern Africa [5]. While many studies have focused on socio-demographic factors, attention to geographic variation has been minimal or absent. This study therefore aimed to map the burden of hypertension in South Africa by conducting a comparative analysis of the 2012 SANHANES and 2016 DHS surveys.
It is evident from this study that hypertension is a significant public health issue in South Africa. The present study highlights both crude and adjusted estimates across the nine provinces, indicating geographic variation in hypertension prevalence for both the 2012 SANHANES and 2016 DHS surveys. This study also found consistent associations with various emerging and traditional hypertension risk factors. Specifically, in the DHS dataset, age, being a drinker and smoker, excess body weight, and the major CVD comorbidities such as high blood sugar, high blood cholesterol, angina, and stroke were associated with a higher hypertension prevalence. In the SANHANES dataset, age, excess body weight, being a current smoker, and the major CVD comorbidities were associated with a higher hypertension prevalence. This indicates that these risk factors have continuously contributed to the exacerbation of hypertension prevalence over time. This may be due to the aging of populations and rapid urbanization, which lead to changes in diet and lifestyle [18]. Similar patterns of association with comorbidities, smoking and alcohol drinking were observed in a previous study by Kandala et al. [9] on geographic variation in hypertension using South African data from 1998.
Estimates from both datasets indicated that South Africans who identify as coloured, those living in urban areas and those with lower formal education levels were more likely to be hypertensive. This is consistent with the literature, as previous studies have indicated that better knowledge of hypertension is associated with improved compliance with antihypertensive treatment and subsequent control of hypertension [19,20]. Rapid urbanization has resulted in populations having elevated levels of psychosocial stress which, coupled with lifestyle and dietary changes, have increased their risk of hypertension or exacerbated the disease [18].
Regarding geographic variation at the province level, after adjusting for all covariates, KwaZulu-Natal and Mpumalanga showed significantly higher hypertension prevalence in 2016. Notably, only 55.5% of KwaZulu-Natal's census enumeration areas were classified as urban, and Mpumalanga is largely rural. While this statistical approach detects the geographic patterns of hypertension prevalence, further research is needed to explain the differences observed. One possible reason is that KwaZulu-Natal has among the highest obesity rates, and obesity is a major risk factor for hypertension. It could also be posited that the nutrition transition, with its resultant changes in dietary and physical activity patterns, progressed at a faster rate between 2012 and 2016 in these provinces than in the other provinces. Notably, several districts in KwaZulu-Natal experienced very high hypertension incidence rates in 2015/2016 [21]. These findings can complement other studies on chronic disease mapping and inform public health policy and health educational programs at both national and provincial levels. The findings will also inform decision making regarding the allocation of public resources and funds to the most affected areas of the population. In addition, these provinces need to be closely monitored over time as they are identified as high-risk regions.
Regarding the statistical approach used in this paper, there are many potential advantages of this approach over more conventional approaches such as discrete-time Cox models with time-varying covariates and fixed or random district effects, or standard 2-level multilevel modelling with unstructured spatial effects [22,23]. In the conventional models, it is assumed that the random components at the contextual level (province in our case) are mutually independent. In practice, these approaches specify correlated random residuals (see, for instance [24]), which is contrary to the assumption. Further, Borgoni and Billari [25] point out that the independence assumption has an inherent problem of inconsistency. They argue that if the location of the event matters, it makes sense to assume that areas close to each other are more similar than areas that are far apart. Moreover, treating groups (in our case districts) as independent is unrealistic and leads to poor estimates of the standard errors. As Rabe-Hesketh and Everitt [26] pointed out, standard errors for between-district factors are likely to be underestimated because we are treating observations from the same districts as independent and thus increasing the apparent sample size.
In contrast, standard errors for within-district factors are likely to be overestimated [27]. In addition, Demographic and Health Survey data are based on a cluster random sample of districts which, in turn, introduces a structured component. Such a component allows us to borrow strength from neighbors in order to cope with the posterior uncertainty of the district effect and obtain estimates for areas that may have inadequate sample sizes or are not represented in the sample. In an attempt to highlight the advantages of our approach in a spatial context and examine the potential bias incurred when ignoring the dependence between aggregated spatial areas, we fit several models with and without the structured and random components.
The study is subject to some limitations. Firstly, the cross-sectional nature of the data limits interpretations of temporality and causality. Secondly, the health behaviour and CVD comorbidity variables are self-reported and are therefore subject to social desirability and recall bias. Further, a known diagnosis of hypertension or other morbidities may influence self-reports of risk behaviors or vice versa. CVD comorbidity variables are self-reported because HbA1c was measured in subsamples of participants and blood cholesterol was only measured in a subsample of SANHANES, which limited the use of these measures to assess blood cholesterol and diabetes. It is often difficult to accurately measure CVD comorbidity in national population-based surveys. Thirdly, there was limited information on dietary habits and physical activity, which are important risk factors for hypertension, in one or both surveys. Fourthly, blood pressure measurements were not taken at the same time of day for all participants due to the large numbers of participants surveyed. Finally, between South Africa's Censuses of 2001 and 2011, the country's demarcations underwent minor changes at provincial and municipal boundaries. While these changes affected small percentages of land area in some provinces, they may have a minor impact on the interpretation of the results. We therefore suggest that these limitations be taken into account when interpreting the results of this study.
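The point about apparent sample size can be illustrated with the classical design effect for equal-sized clusters, deff = 1 + (m - 1)ρ, where m is the cluster size and ρ the intra-cluster correlation. This is a textbook approximation added for illustration, not part of the modelling approach described in this paper, and the correlation value below is hypothetical.

```python
import math


def design_effect(cluster_size, icc):
    """Design effect for equal-sized clusters: 1 + (m - 1) * rho."""
    return 1 + (cluster_size - 1) * icc


def effective_sample_size(n, cluster_size, icc):
    """Number of independent observations that n clustered ones are worth."""
    return n / design_effect(cluster_size, icc)


# Hypothetical example: 10,000 units in clusters of 20 with rho = 0.05.
deff = design_effect(20, 0.05)
n_eff = effective_sample_size(10_000, 20, 0.05)
# Standard errors that ignore clustering are too small by about sqrt(deff).
print(round(deff, 2), round(n_eff), round(math.sqrt(deff), 2))
```

Even a small intra-cluster correlation roughly halves the effective sample size in this example, which is why ignoring the clustering understates the standard errors of between-cluster factors.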

Conclusions
The provinces of KwaZulu-Natal and Mpumalanga showed significantly higher hypertension prevalence in 2016. These provinces need to be monitored over time as they are identified as high risk. In both survey years, hypertension was significantly higher among participants who were male, from the coloured population group and from urban areas. The prevalence of hypertension increased non-linearly with age, BMI, waist circumference, and blood cholesterol. The findings can inform public health policy and health educational programs at both national and provincial levels, and inform decision making regarding the allocation of public resources to the most at-risk regions of the population, with the aim of achieving a more significant reduction in the scourge of hypertension and related issues in South Africa.

Informed Consent Statement:
Informed consent was obtained from all participants involved in the study.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author. The SADHS data are available on request from the DHS website https://dhsprogram.com/data/dataset/South-Africa_Standard-DHS_2016.cfm?flag=0 (accessed on 15 June 2019) and the SANHANES data are available on request from http://datacuration.hsrc.ac.za/ (accessed on 15 June 2019).