Using Prospective Methods to Identify Fieldwork Locations Favourable to Understanding Divergences in Health Care Accessibility

Abstract: Central to this article is the issue of choosing sites where fieldwork could provide a better understanding of divergences in health care accessibility. Access to health care is critical to good health, but inhabitants may experience barriers to health care that limit their ability to obtain the care they need. Most inhabitants of low-income countries need to walk long distances along meandering paths to get to health care services. Individuals in Malawi responded to a survey with a battery of questions on perceived difficulties in accessing health care services. Using both vertical and horizontal impedance, we modelled walking time between the household locations of the individuals in our sample and the health care centres they were using. A digital elevation model and Tobler's hiking function were used to represent vertical impedance, while OpenStreetMap integrated with a land cover map was used to represent horizontal impedance. Combining measures of walking time and perceived accessibility in Malawi, we used spatial statistics and found spatial clusters with substantial discrepancies in health care accessibility, which represent fieldwork locations favourable for providing a better understanding of barriers to health access.


Introduction
Access to health care is critical to good health, but inhabitants may nevertheless experience barriers to health care that limit their ability to obtain the care they need. Barriers to health care access can be spatial, such as long travel distance, or non-spatial, such as affordability, appropriateness, and accommodation [1]. For a health project focusing on accessibility to health facilities in Malawi, we explore both perceived difficulties in accessing health care (non-spatial) and measured (spatial) access to health services. We hypothesize that there is a general relationship between perceived difficulties in accessing health services and measured accessibility: When measured accessibility is good, the perceived accessibility is good as well, and when measured accessibility is poor, the perceived accessibility is poor as well. This assumption has previously been confirmed by research that demonstrates a strong negative relationship between choice of health facility and the distance from the patient's place of residence to the facility (e.g., [2]). If the assumption is true, the observations should be found close to a trend line and their vertical deviations (or residuals) from the line should be small. However, Casas et al. [3] compared potential versus revealed access to care in Colombia and found that the closest healthcare centre was rarely the patient's choice, and that travel time is heavily influenced by income. Patients from wealthy neighbourhoods were more likely to travel longer to receive health care, a phenomenon that Akin and Hutchinson [4] call 'bypassing'. Other patients may perceive difficulties in accessing health care (for instance, due to poor service), but they may not have any alternative places to go or may not be able to go elsewhere.

Research Aim
An overarching research aim we are pursuing is to investigate why some households have low values on measured accessibility (good measured accessibility) but have high values on perceived accessibility (poor perceived accessibility), and why some households have high values on measured accessibility (poor measured accessibility) but have low values on perceived accessibility (good perceived accessibility). The aim of this article is narrower, namely, to develop a methodology that identifies clusters of households that fall into either of these divergence categories. We believe these clusters represent interesting fieldwork locations where further qualitative analysis may reveal knowledge important to understand barriers to health care accessibility.

Prospecting
The standard approach for conducting fieldwork broadly involves a sequence of selecting and entering the field, gathering and recording data, and leaving the field [5]. Central to this article is the issue of choosing where to perform fieldwork. The use of geographic information systems (GIS) may help in choosing the areas in which to do fieldwork [6], using a sampling or a prospecting procedure [7]. Sampling should be used when the study area is too large to be investigated in its totality and one or more venues need to be selected randomly from potentially homogeneous sites [8]. Prospecting should be used when the aim is to increase the probability that fieldwork at the targeted site will reveal new knowledge. Prospecting is common within archaeology, as there is a need to keep the number of fieldwork venues to a minimum, that is, to where settlements of ancient populations are most likely to be found. Archaeological sites tend to be found in environments with specific characteristics [7], and prospecting models study the environmental differences between areas with and without archaeological sites in order to identify areas where the probability of an archaeological site location is higher.

Measuring Accessibility
Accessibility is typically measured in GIS either by using a variant of the floating catchment area method or by using travel distance/time. The floating catchment area (FCA) method defines the service area of physicians by a threshold travel time, while accounting for the availability of physicians relative to the surrounding demand [9]. The two-step floating catchment area method (2SFCA), first proposed by Radke and Mu [10] and later modified by Luo and Wang [9], calculates the physician-to-population ratio in two steps. The first step assigns an initial ratio to each catchment (or service area) centred at physician locations, and the second step sums up the initial ratios in the overlapping service areas where residents have access to multiple physician locations [11]. However, the 2SFCA is limited in that it assumes that all population locations within the catchment have equal access and disregards the distance impedance within the catchment [12]. To remedy these shortcomings, the enhanced 2SFCA method uses weights to differentiate travel time zones in both steps, thereby accounting for distance decay [11]. A three-step FCA method was later developed, considering that people's demand for a medical site will decrease when adjacent sites are also available [12]. To measure accessibility with any of the FCA methods, one needs three indicators: Population demand, which is the number of people who will potentially need healthcare and is represented by the population in a geographic unit; supply and capacity of health care, represented by the number of physicians or the number of beds; and a measure of spatial separation, which can be expressed as a distance, travel time or travel cost [13].
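The two steps of the basic 2SFCA method described above can be illustrated with a minimal sketch. The function name, the toy numbers, and the 30-minute threshold are assumptions made for this example only; they are not taken from the cited studies, and the sketch omits the distance-decay weights of the enhanced variant.

```python
import numpy as np

def two_step_fca(supply, demand, travel_time, threshold):
    """Basic 2SFCA without distance decay inside the catchment.

    supply:      shape (n_sites,), e.g. physicians per site
    demand:      shape (n_pop,), population per location
    travel_time: shape (n_pop, n_sites), travel time from location to site
    threshold:   catchment size, in the same unit as travel_time
    """
    supply = np.asarray(supply, dtype=float)
    demand = np.asarray(demand, dtype=float)
    # Binary catchment membership: 1.0 if the site is reachable in time.
    within = (np.asarray(travel_time, dtype=float) <= threshold).astype(float)

    # Step 1: supply-to-demand ratio for the catchment of each site.
    demand_in_catchment = within.T @ demand
    ratio = np.divide(supply, demand_in_catchment,
                      out=np.zeros_like(supply),
                      where=demand_in_catchment > 0)

    # Step 2: for each population location, sum the ratios of reachable sites.
    return within @ ratio

# Hypothetical toy data: 3 population locations, 2 sites.
supply = [2, 1]
demand = [100, 200, 100]
travel_time = [[10, 50], [20, 25], [60, 15]]  # minutes
scores = two_step_fca(supply, demand, travel_time, threshold=30)
```

In this toy case the middle location lies inside both catchments and therefore receives the sum of both ratios, which is the behaviour the second step is meant to capture.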
Travel distance and travel time thus constitute only one of the three indicators needed for an FCA method, and are typically estimated using one of three approaches: (1) Using straight-line/Euclidean distance, (2) using a vector-based approach (e.g., network analysis) or (3) using a raster-based approach (e.g., path analysis). The straight-line distance is appropriate when measuring travel time for airborne transport and is also strongly linked to the driving distance and driving time in Northern England [14]. However, numerous studies from Sub-Saharan Africa (SSA) point to the lack of statistical significance of Euclidean distance as an explanatory factor for the access to and use of healthcare facilities and services and subsequent health outcomes [15]. This may be because the Euclidean distance between the household and health service facility fails to represent the real distance travelled and fails to represent travel time [16], and/or because factors other than spatial distance influence accessibility to health care facilities in SSA.
Although travel time along irregular road networks is recognized as a more appropriate measure than straight-line distance [17], applications of such methods in developing countries remain constrained by lack of data [18]. However, with the advancement of OpenStreetMap (OSM), this is likely to change (e.g., [19]).
A raster-based approach to calculating travel time involves the use of factors that represent the cost or impedance of moving from one cell to another. Tanser et al. [20] computed the walking time from every pixel to the nearest clinic using a horizontal cost surface only. The cost surface used consisted of roads with five different levels, tracks, areas between roads or tracks, and inaccessible areas. In a typical scenario in their walking models, individuals walked to the nearest track at 2 km/h, then along a track at 3 km/h and eventually walked to the nearest clinic at a maximum speed of 4 km/h [20]. Except for the representation of inaccessible areas (game reserves, large rivers, and dams), Tanser, Gijsbertsen, and Herbst [20] treated all other land use categories equally. Paes et al., who used Tobler's hiking function as a cost function for walking accessibility in infrastructure-poor regions, conclude their article with the prospect of including horizontal costs: 'It would also be interesting to examine the impact on accessibility of different land cover types, including the presence of potential barriers that must be crossed or circumvented (e.g., wetlands) or facilitators to travel (e.g., dirt trails or tracks on the terrain)' [21] (p. 10). Most studies that have investigated how easy or difficult it is to traverse various land cover classes have done so by coding each raster cell with values ranging from 1 to 5 representing the speed (in km/h) at which that pixel can be traversed (e.g., [19]). When using least cost path analysis, it is more convenient to represent movement across a pixel with a value representing the cost or impedance of traversing the area covered by that pixel [22].
To take advantage of the sophisticated travel time estimation for travel on foot that includes both vertical and horizontal impedance, we use in this article a raster-based approach to measure accessibility. This is also favourable as we want to combine spatial accessibility with the perceived accessibility.
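The raster-based accumulation of travel cost described above can be sketched as a shortest-path search over the grid. This is a simplified, isotropic illustration written for this text: the function name, the 8-neighbour rule, and the convention that a move costs the mean impedance of the two cells times the distance between their centres are assumptions, and vertical impedance is left out.

```python
import heapq
import math

def accumulated_cost(cost, source, cell_size=1.0):
    """Least accumulated cost from `source` to every cell (8-neighbour Dijkstra).

    cost:   2D list of impedance values per unit distance
    source: (row, col) of the facility the cost is accumulated from
    """
    n_rows, n_cols = len(cost), len(cost[0])
    best = [[math.inf] * n_cols for _ in range(n_rows)]
    best[source[0]][source[1]] = 0.0
    queue = [(0.0, source[0], source[1])]
    while queue:
        acc, r, c = heapq.heappop(queue)
        if acc > best[r][c]:
            continue  # stale queue entry, a cheaper route was already found
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if dr == 0 and dc == 0:
                    continue
                if not (0 <= nr < n_rows and 0 <= nc < n_cols):
                    continue
                # Diagonal moves are longer than orthogonal moves.
                step = cell_size * math.hypot(dr, dc)
                new_acc = acc + step * (cost[r][c] + cost[nr][nc]) / 2.0
                if new_acc < best[nr][nc]:
                    best[nr][nc] = new_acc
                    heapq.heappush(queue, (new_acc, nr, nc))
    return best

# Hypothetical 3 x 3 grid: a high-cost barrier (999) blocks the direct route.
grid = [[1, 1, 1],
        [999, 999, 1],
        [1, 1, 1]]
surface = accumulated_cost(grid, source=(0, 0))
```

In the toy grid, the cheapest route to the bottom-left cell goes around the barrier rather than across it, which is the effect a very high impedance value is intended to produce.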

Combining Spatial and Non-Spatial Measures of Accessibility
Several studies have used GIS-based measures of accessibility to health facilities [9,23], some of which are innovative in various ways, such as investigating accessibility with respect to different age groups [24]. Other studies use survey-based measures of perceived accessibility [25], while the novelty of the study presented in this article is to combine GIS-based measures with the survey-based measure of accessibility. Literature reviews state that there has been little research concerned with the relationship or mismatch between measured and perceived access to health care services (e.g., [23]), but also that there is a growing recognition of the importance of both spatial and non-spatial factors [26]. Although still rare, there are some examples of approaches that combine spatial dimensions related to geographic access (distances, travel times, catchments, etc.) with research that considers socio-economic aspects of access related to cost, insurance provision, etc. [25]. Ryan et al. [27] used a sample of 128 individuals in Perth, Australia, to make a spatial comparison of perceived and measured accessibility to the train station across different modes (park and ride, bus and ride, and walk and ride) and across three different age groups. Hawthorne and Kwan's [28] satisfaction adjusted distance (SAD) used the conventional street network distance as a baseline but added or subtracted a factor calculated from 65 individuals' survey responses on perceived accessibility for inhabitants of Columbus, Ohio (USA). While both these studies are based on small samples, Comber et al. [29] combined the analysis of public perception of service accessibility in Leicester, UK, from a large-sample attitudes survey (n = 8530) with an analysis of geographic road distance to those services.

Accessibility Studies in a Sub-Saharan African (SSA) Context
Whereas there are numerous articles studying accessibility to health care facilities in European and North American countries, these are much rarer for Sub-Saharan African (SSA) countries [20]. Exceptions include studies that identify areas with poor access to health facilities and plan the location of new centres to treat malaria in Kenya [30], and studies that document access to tuberculosis treatment [31] or HIV treatment [32] in South Africa. These early examples of GIS-based accessibility studies, as well as [33], emphasize the problems of studying access to health facilities in data-scarce environments.
In SSA, the distance patients need to travel to receive health care is usually greater than in European and North American countries [34]. In addition, walking is the predominant form of transportation in rural Africa due to the lack of infrastructure and motorized transport services [30,35]. Moreover, travel in SSA is often not just along established roads but also along any possible route between locations [36]. Accessibility studies should therefore always include vertical impedance (slope) [37], but it is largely absent from walking accessibility measures [21]. A digital elevation model (DEM) allows the incorporation of slope into the analysis, which is important since terrain steepness accelerates or impedes the speed of walking [35]. In a study from Niger, Blanford, Kumar, Luo, and MacEachren [36] included both horizontal and vertical impedance to assess pedestrians' accessibility to hospitals and health centres using USGS's GTOPO30 DEM with a 1 km resolution. Furthermore, to estimate walking time to healthcare centres in Rwanda, two studies have used a DEM from the Shuttle Radar Topography Mission with a considerably finer resolution of 90 m [15,35]. An even finer DEM, obtained from the ASTER GDEM with a spatial resolution of 30 m, was used to assess the walking accessibility to primary healthcare centres in Mozambique [38].

The Added Value of Our Approach
This article contributes to the literature on health care accessibility in six ways: (1) We examine accessibility for people in a low-income country who must walk long distances to obtain health care. Prior healthcare accessibility research in resource-poor settings has utilized Tobler's [39] hiking function, as we do, to measure geographic accessibility to health care centres in Mozambique [38], but with all travel restricted to main, secondary or tertiary roads. We measure accessibility, represented by walking time, using a sophisticated path analysis involving both horizontal and vertical impedance. (2) We measure pedestrian travel time using the finest-resolution datasets currently available. While SSA is often considered a data-scarce environment, our study also demonstrates that globally available high-resolution elevation data, land use data, and crowdsourced datasets (i.e., OpenStreetMap) make sophisticated access analysis possible in countries without a well-developed national spatial data infrastructure.
(3) Although studies on health care and health outcomes in Africa are not as limited as they were almost 20 years ago [40], a weakness in most of them is that they do not take people's perception of access into account. One exception is [41], who combined physical distance to the nearest immunization centre with mothers' perceptions of distance as determinants of child immunization in Nigeria, where the perception of distance turned out to be a more robust determinant than actual distance. This highlights the need to combine people's perceptions of barriers to health care with more objective measures of accessibility to identify causes of poor access. (4) While several studies tend to emphasize that barriers to health care are linked to specific socio-economic characteristics of the individuals [42], our point of departure is that barriers to health care are also linked to individual vulnerability factors such as functional limitations. For a person in relatively good health, having to walk to get health care may not be an obstacle. However, for a person with disabilities, having to walk even a short distance could effectively deter access. Hence, this article considers the interaction between individual and contextual characteristics, since individual factors of vulnerability may moderate or mediate the impact of physical barriers on access, and vice versa. (5) Our study is based on a utilization dataset and measures actual geographical accessibility based on a large sample of individual-level data (n = 2221), and thus differs from common approaches that examine potential accessibility using aggregated information [3] or approaches that measure travel distances to the nearest health centre (e.g., [15,35]). (6) Our position is that barriers to health access are best investigated using a combination of quantitative and qualitative research methods and that qualitative fieldwork is needed to uncover the most important barriers to health access. A key contribution of this article is a research design for deciding where such fieldwork should be carried out.

Health Facilities
GPS coordinates for health facilities were obtained from the Ministry of Health, except for coordinates for private hospitals which were collected in the field by researchers from University of Malawi. The eight health facilities we used are all included in, and within 6 m distance from the corresponding items in the spatial database of health facilities managed by the public sector in Sub-Saharan Africa [43].

Generating a Composite Variable for Perceived Accessibility
We used 18 questions developed by a team of experts including four African countries responsible for a previous survey on health service accessibility performed in Malawi in 2011 and 2012 [44]. While the questions incorporate the five accessibility dimensions suggested by Penchansky and Thomas [45], they were not intended to test these dimensions, and are not formulated in the same way. Additionally, they include individual aspects of making the choice of accessing health care. The 18 items were subjected to scale analysis and subsequent factor analysis, yielding a Cronbach's alpha of 0.85 and support for a one-factor solution. The scores for households' responses to the 18 questions were therefore summed and stored in one composite variable called AccessSum (see Table 1). The data comprise 3526 respondents whose household locations are geocoded, enabling us to map their distribution, but almost 4% of the geocoded records are incorrect. These include, for instance, latitude and longitude values that are either missing or zero, while some coordinate values are incorrectly located outside the Malawian borders. We went through each of these records and decided to exclude them. From the remaining 3393 records, we made a subset of those who report that they walk to the health facilities, that is, 65.49%, leaving us with 2221 records for further analysis. Figure 1 shows the locations of these remaining records for households that use health services in four districts in Malawi: In the north (Rumphi), in the centre (Ntchisi), in the southwest (Blantyre), and in the southeast (Phalombe).
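The scale analysis and the summed composite described above can be illustrated with a minimal sketch. It uses the standard formula for Cronbach's alpha; the function name and the small response matrix are hypothetical and stand in for the 18 survey items, which are not reproduced here.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an array of shape (n_respondents, n_items)."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    # Sum of the sample variances of the individual items.
    item_variance_sum = items.var(axis=0, ddof=1).sum()
    # Sample variance of the summed score across respondents.
    total_variance = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1.0 - item_variance_sum / total_variance)

# Hypothetical responses: 4 respondents x 3 Likert items (1 to 5).
responses = np.array([[1, 2, 1],
                      [2, 2, 3],
                      [4, 5, 4],
                      [5, 4, 5]])
alpha = cronbach_alpha(responses)

# Composite score per respondent, analogous to AccessSum.
access_sum = responses.sum(axis=1)
```

A high alpha indicates that the items vary together, which is the usual justification for adding them into a single score with equal weights.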

Generating a Variable for Measured Accessibility (Walking Time)
For the horizontal cost surface, we used the global dataset for 2010 at a 30 m resolution and with 10 land cover classes (Globeland30, available from http://www.globallandcover.com, accessed on 15 April 2017) produced by the National Geomatics Centre of China [46]. We integrated Globeland30 with a rasterization of the OSM (available from http://download.geofabrik.de/africa/malawi.html, accessed on 15 April 2017). Thus, any pixel position classified as road in the OSM raster replaced the corresponding pixel from the global land cover dataset. Guided by previous studies expressing movements across land use categories [19,22,35,47,48], we reclassified the land cover classes to cost values. We assigned the code '1' to the integrated OSM pixels. As we wanted to reserve the lowest cost ('1') for the road class, we assigned the cost value '2' to grassland, bare land, and artificial surfaces, as we considered these land use categories to be rather easy to traverse. Cultivated land, forest, and shrubland are typically considered to be harder to traverse, and we coded these land use categories with the cost value '6', and coded wetland with the cost value '10'. The 'water' category was given a score of '999', assuming people would not be willing to swim across a water body but would rather go around it. As there is no tundra or permanent snow and ice in Malawi, these land use categories were not applicable. We used the resulting five-class land cover raster (with values 1, 2, 6, 10, and 999) as the cost raster input to the path distance tool to model the isotropic friction across the landscape. Vertical impedance represents the fact that movements downhill are easier compared to movements uphill. For input representing the vertical impedance, we used a DEM with a 30 m resolution from ALOS World 3D-30 m (AW3D30, accessible from http://www.eorc.jaxa.jp/ALOS/en/aw3d30/, accessed on 15 April 2017) made available by the Japan Aerospace Exploration Agency in 2015 [49].
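The reclassification of land cover into horizontal cost values, with road pixels overriding the land cover class, can be sketched as follows. The cost values are those stated above; the numeric class codes are assumed to follow the usual GlobeLand30 legend and should be checked against the dataset documentation, and the function and variable names are hypothetical.

```python
import numpy as np

ROAD_COST = 1

# Assumed GlobeLand30 class codes mapped to the cost values given in the text.
LANDCOVER_COST = {
    10: 6,    # cultivated land
    20: 6,    # forest
    30: 2,    # grassland
    40: 6,    # shrubland
    50: 10,   # wetland
    60: 999,  # water bodies
    80: 2,    # artificial surfaces
    90: 2,    # bare land
}

def build_cost_raster(landcover, road_mask):
    """Reclassify land cover to cost values and burn in roads as cost 1."""
    landcover = np.asarray(landcover)
    cost = np.zeros(landcover.shape, dtype=np.int32)  # 0 marks unmapped codes
    for code, value in LANDCOVER_COST.items():
        cost[landcover == code] = value
    # Road pixels replace whatever land cover class lies underneath.
    cost[np.asarray(road_mask, dtype=bool)] = ROAD_COST
    return cost

# Hypothetical 2 x 2 tile: grassland, cultivated land with a road, water, wetland.
landcover = np.array([[30, 10],
                      [60, 50]])
roads = np.array([[0, 1],
                  [0, 0]])
cost = build_cost_raster(landcover, roads)
```

Keeping the mapping in a single dictionary makes it easy to test the sensitivity of the walking times to alternative cost values.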
The elevation raster is used to calculate the slope values for each cell. These slope values are used to calculate the vertical impedance incurred when moving from one cell to another. To simulate the walking cost, we multiplied the slope values with a factor to make the representations of movements uphill, downhill, and on a flat surface more realistic. We used Tobler's hiking function [39] that predicts the human walking speed based on the slope. Tobler's original formula calculates walking on a flat terrain at approximately 5 km/h. The highest walking speed is achieved in gentle downhill slopes, with speeds gradually declining as slope values decrease or increase. Tripcevich [50] has converted the data provided by Tobler into a structure that GIS software packages accept as a 'vertical factor table'. The vertical factor table is a two-column table with degree slopes in column 1 and the appropriate vertical factor in column 2. Maximum walking speed occurs at a slight downhill (approximately 3 degrees) and is 6012 m per hour [51]. The vertical factor table is the input to the path distance function [52].
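As an illustration of how slope translates into walking speed, the sketch below uses the commonly cited closed form of Tobler's hiking function, W = 6 exp(-3.5 |S + 0.05|), with W in km/h and S the slope expressed as rise over run. It is not the vertical factor table from [50], and the function names are assumptions made for this example.

```python
import math

def tobler_speed_kmh(slope_degrees):
    """Walking speed in km/h for a slope in degrees (negative = downhill)."""
    s = math.tan(math.radians(slope_degrees))  # rise over run
    return 6.0 * math.exp(-3.5 * abs(s + 0.05))

def walking_time_hours(distance_m, slope_degrees):
    """Time needed to walk `distance_m` metres along a constant slope."""
    return (distance_m / 1000.0) / tobler_speed_kmh(slope_degrees)

flat_speed = tobler_speed_kmh(0.0)       # roughly 5 km/h on flat terrain
uphill_speed = tobler_speed_kmh(5.0)     # slower when walking uphill
downhill_speed = tobler_speed_kmh(-5.0)  # faster on a gentle downhill
```

Because the function is asymmetric around zero slope, the walk to a facility and the walk back along the same path generally take different amounts of time.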
We performed the path analysis twice: First to model the travel time for each individual from their household location to the health centre they are using, and thereafter to model the travel time back, using the ToblersTowards and ToblersAway tables provided by Tripcevich [50]. The results are two raster layers. We extracted cell values at the point locations of the households, summed the travel times to and from the health centre, and stored these values in a new variable called TotWalkTime (total walking time).

Regression and Residual Analysis
Having estimated the walking time for every individual in our sample, the next stage involves a regression analysis of the relationship between perceived difficulty in accessing health services and measured accessibility. We formulated a simple model of perceived accessibility as a function of respondents' age (Age), their functional limitations (LimFunc, see Table 2), and measured accessibility (total walking time, TotWalkTime). This model is then estimated using ordinary least squares regression, and a new variable containing the residual value for each observation is created. The effects of the independent variables behaved as expected and were all significant at the p < 0.01 level, but the coefficient of determination (R²) is only 0.038. Hence, this model is only able to explain a small portion of the variation in perceived accessibility, and the potential for uncovering more important determinants through fieldwork should therefore be large. Table 2 lists the eight survey items used to construct the functional limitation (LimFunc) variable. The items in Table 2 are the standardized Washington Group Short Set for identifying activity limitations (1-6) with the addition of one item on mental limitations (7) and one item on limitations beyond age expectancies (8) [53]. It is known from the literature that higher levels of activity limitations increase barriers to accessing health care in low-income contexts [44]. Activity limitation is a broad concept drawn from the International Classification of Functioning, Disability and Health (ICF) [54] capturing limitations in daily life activities. The Short Set with the additional two questions [55] is used in the analyses in this article to eliminate confounding effects of individual variation in functioning on access to health care.
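The regression step described at the start of this section can be sketched under the assumption that the model is linear and additive in the three predictors, i.e. AccessSum = b0 + b1·Age + b2·LimFunc + b3·TotWalkTime + e. The model equation is not shown in the text, so this form is inferred from the verbal description; the helper name and the toy data are hypothetical.

```python
import numpy as np

def ols_residuals(y, predictors):
    """Fit y on the predictors plus an intercept by ordinary least squares.

    Returns the coefficient vector, the residual for each observation,
    and the coefficient of determination.
    """
    y = np.asarray(y, dtype=float)
    columns = [np.ones(len(y))] + [np.asarray(p, dtype=float) for p in predictors]
    X = np.column_stack(columns)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ beta
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return beta, residuals, 1.0 - ss_res / ss_tot

# Hypothetical values for six respondents.
access_sum = [22, 30, 25, 41, 38, 27]
age = [25, 40, 33, 70, 61, 48]
lim_func = [8, 10, 8, 15, 13, 9]
tot_walk_time = [0.5, 3.2, 1.0, 6.5, 4.8, 2.1]

beta, residuals, r_squared = ols_residuals(
    access_sum, [age, lim_func, tot_walk_time])
```

A positive residual marks a respondent who reports more difficulty than the model predicts, and a negative residual marks one who reports less; these residuals are the values carried forward to the cluster analysis.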
The eight survey questions asked about difficulties you may have doing certain activities due to a health problem or impairment, and the responses were scores on a four-point scale (1: no difficulty, 2: some difficulty, 3: a lot of difficulty, and 4: unable). The scores from the eight items were combined into a composite measure (the LimFunc variable) by adding the scores together with equal weights. By modelling a more complex relationship between perceived accessibility and measured accessibility, where parts of the variation in perceived accessibility can be attributed to other factors explicitly controlled for, one may narrow down the scope of the screening to areas where one does not possess relevant quantitative information or has no clear expectations about what other factors could explain perceived access. In other words, we apply a model that says that the score in perceived access is not simply dependent on the score of measured accessibility, but rather depends on other variables as well. The scope of the screening is thereby reduced to areas where the observations do not conform well to the model, i.e., where the residuals are clustered and large.

Local Spatial Statistics
To identify statistically significant spatial clusters of high residual values (hot spots) and/or low residual values (cold spots), we performed a hot spot analysis known as the Getis-Ord Gi* statistic [56], commonly implemented in commercial GIS packages [57]. The Gi* statistic returns z-scores, p-values, and Gi_Bin values for each point feature (i.e., household). There are seven possible Gi_Bin values, which are integers ranging from −3 to +3. Significant hot spots have positive Gi_Bin values, high positive z-scores, and small p-values. Significant cold spots have negative Gi_Bin values, low negative z-scores, and small p-values. Features with z-scores close to zero (Gi_Bin value 0) are not statistically significant [58].
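A minimal sketch of the Gi* z-score and its binning is given below. It uses the standard formula with binary fixed-distance weights that include the feature itself; the bandwidth, the toy data, and the cut-offs of 1.65, 1.96, and 2.58 (two-sided 90%, 95%, and 99% confidence) are assumptions for illustration. GIS packages may conceptualize spatial relationships differently or apply a multiple-testing correction, so their Gi_Bin values can differ from this simplified version.

```python
import numpy as np

def getis_ord_gi_star(values, coords, bandwidth):
    """Gi* z-score per feature using binary fixed-distance weights."""
    x = np.asarray(values, dtype=float)
    xy = np.asarray(coords, dtype=float)
    n = len(x)
    dist = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    w = (dist <= bandwidth).astype(float)  # each feature is its own neighbour
    mean = x.mean()
    s = np.sqrt((x ** 2).mean() - mean ** 2)
    w_sum = w.sum(axis=1)
    numerator = w @ x - mean * w_sum
    denominator = s * np.sqrt((n * (w ** 2).sum(axis=1) - w_sum ** 2) / (n - 1))
    return numerator / denominator

def gi_bin(z_scores):
    """Map z-scores to bins from -3 to +3 using assumed two-sided cut-offs."""
    z = np.asarray(z_scores, dtype=float)
    bins = np.zeros(z.shape, dtype=int)
    for level, cutoff in ((1, 1.65), (2, 1.96), (3, 2.58)):
        bins[np.abs(z) >= cutoff] = level
    return bins * np.sign(z).astype(int)

# Hypothetical residuals for ten households along a line, 1 unit apart.
coords = [(i, 0) for i in range(10)]
resid = [9, 9, 9, 1, 1, 1, 1, 1, 1, 1]
z = getis_ord_gi_star(resid, coords, bandwidth=1.0)
bins = gi_bin(z)
```

Note that the bandwidth must be small enough that no neighbourhood covers the whole dataset, since the denominator is zero in that case.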
The conceptualization of spatial relationships and the scale of analysis may influence the statistical significance of the spatial clustering of values. As the households are in four different districts distant from each other (see Figure 1), using the entire country as the scale of analysis would not be appropriate (thus we assume spatial independence between the distinct geographic areas). Therefore, we performed the hot spot analysis separately for the eight subsets of households using the eight different health clinics.

Survey Summary
With Likert scale scores ranging from 1 (no problem) to 5 (insurmountable problem) summed for the 18 variables, the possible values for the AccessSum-variable range from 18 to 90. As many as 12.65% of the respondents answered no problem to all 18 questions (and have value 18 on the AccessSum variable). None of the respondents answered insurmountable problem to all 18 questions. Values in the AccessSum-variable range from 18 to 73, have a median value of 26, and interquartile range of 11 (Q1 = 22 and Q3 = 33). Figure 2 shows the calculated path distance surface for the walking time to the Chimembe health centre in the southern part of Malawi. The household towards the north-eastern corner of the map is 6.12 km away from the health centre, but as measured along the road network and adjusting for terrain variation, the distance is estimated to be 9.14 km.

Figure 2. Comparing the shortest path distance from household to health facility with straight-line distance and elevation profile of the example path (household marked with a blue square and Chimembe health centre with a red cross).

Figure 2 also shows the profile of the example path. The highest point along this path is 871 m a.s.l. at the household location. The lowest location is at 651 m a.s.l. near the Chimembe health centre, which is at an elevation of 670 m. The path has an average slope of 4.9 degrees and follows the road network except for about 250 m where one needs to cross the grassland (impedance = 2). Walking time along this path, from the household to the Chimembe health centre, is estimated using Tobler's hiking function to be 2 h and 16 min, and 2 h and 28 min going back. By summing these, we get the value of 4 h and 34 min or 4.57 h, which is the TotWalkTime-variable value stored for this individual. The unit for the TotWalkTime-variable is thus hours and the values range from 0.03 to 17.18, have a mean of 3.47, and standard deviation of 3.01.

Using Local Spatial Statistics to Identify Significant Clusters
There are many regions with household locations having either low or high residual values. These can be interesting sites, but the high residual values can also be a result of random chance. To qualify as part of a significant hot spot, a household and its surrounding households must have similarly high values. To assist the decision as to which of the areas the fieldwork should go to, we look for significant clusters of high positive residual values and/or low negative residual values. Positive residuals mean that the households have a higher AccessSum score than the model predicts. They will, in other words, rate their access to health services as poorer than the measured accessibility suggests. Negative residuals mean that the households have a lower AccessSum score than the modelled relationship suggests, which means that these household members consider the health service accessibility to be better than the measured accessibility suggests. Table 3 shows the summarized numbers of households falling into the seven possible Gi_Bin categories. For two of the catchments (Chitekesa and Mwanga), there is no significant spatial clustering of either low or high residual values. Khuwi has no significant cold spots and Lura has no significant hot spots. Significant cold spots occur in five of the catchments and significant hot spots in five, and four of the catchments have both. From Table 3, one catchment is very different from the others, namely Chimembe, with as much as 41% of the households in significant cold spots and significant hot spots.

Table 3. Counts of households falling into the various Gi_Bins (percentage in brackets). Colours represent Gi_Bin values: blue colours are cold spots, red colours are hot spots, and the yellow colour represents insignificant clustering. The darker the blue/red colour, the higher the confidence level.

Figure 3 shows a map of the Chimembe catchment area where the households are marked with their Gi_Bin score. Several of the households that are near the Chimembe health centre form a significant hot spot. These households have high positive residuals and are more dissatisfied with the health accessibility than we would expect from their location (relatively near the health centre). Several other households, in the upper right corner of the map, form a significant cold spot. These are households with large negative residuals, which are more satisfied than we would expect from their location relatively far away from the health centre. Chimembe was selected as a fieldwork site, and the identified divergences were further explored in qualitative fieldwork carried out in October and November 2017 [59].

Discussion
In Malawi, a large part of the population needs to walk to get to health care. In our sample of persons walking to and from health services, the average total walking time was almost 3.5 h, with a standard deviation close to 3 h. This means that many health care seekers may need to spend most of their day walking when seeking health care. Nevertheless, 62% of our sample is measured to have good access to health care facilities (that is, their travel time to a health service facility is shorter than the average travel time). For the remaining 38%, however, the measured access to health care facilities may indeed be poor, with long distances and inaccessible roads and paths. Many respondents nevertheless perceived their difficulties in accessing health care to be small (the median value for the AccessSum variable is 26 on a range from 18 to 90). Why?

Interpersonal Relationship
Results from the focus group discussion during the fieldwork in Chimembe in 2017 attribute this perception to the good nature and welcoming attitude of the staff at the health facility, as well as the perceived appropriateness of the care received [59]. Indeed, even in a resource-scarce society such as Malawi, it is the interpersonal relationships between patients and health service providers that have the largest impact on perceived difficulties in accessing health care [60].
We have measured the walking time to the health care facility that the patient states they are using, which may not necessarily be their designated one. The term 'bypassing' was briefly mentioned in the introduction as a term that describes the behaviour of patients who choose to travel beyond local health care facilities in favour of more distant and often more expensive ones [4]. 'Bypassing' usually indicates significant problems with the quality of care at the bypassed clinic or a considerably better service quality at the alternative facility [4]. Akin and Hutchinson [4] describe 'bypassing' as a widespread phenomenon in the developing world. Information from the qualitative fieldwork indicates that bypassing is happening in Malawi as well, and supports the fieldwork finding that distance is secondary to qualitative considerations in the perception of health care access [60].

Prospecting Being Developed as a Spatial Method within Archaeology
The current study demonstrates how a screening procedure may be designed to identify the most promising sites for qualitative fieldwork. While this is a rather common and necessary activity in archaeological fieldwork, it is uncommon and, in our opinion, an often-neglected issue in qualitative fieldwork. Using residuals from a regression to select sites for fieldwork is an old idea from geography appearing, among other places, in Peter Haggett's book 'Locational analysis in human geography' [61]. This book was influential for both 'quantitative geography' and the 'new (quantitative) archaeology'. However, 'while quantitative archaeologists began celebrating their victory, geographers were conversely beginning to strongly criticize 'Quantitative Geography', which had provided a major source of inspiration to New Archaeology' [62] (p. 44). This may help explain why prospecting was developed within archaeology and not within geography, although it was inspired by locational theories and analyses developed by geographers in the 1960s.

Limitations
Evidently, our study has some limitations. First, some of the household locations may be positionally inaccurate. GPS was used during the survey in 2011 and 2012 to capture the locations of participating households. GPS receivers require an unobstructed line of sight to four or more GPS satellites to determine an accurate location. Although even a low-cost GPS receiver could obtain relatively accurate measurements in 2011, there are several ways in which the measurements could be erroneous. Second, as we excluded records that were obviously erroneous to avoid bias in the measured walking distances, we could have introduced a bias in the AccessSum variable representing the perceived difficulties in accessing health care services. Therefore, we also calculated the AccessSum variable for the subset of individuals who walked to get to health services but were excluded due to erroneous coordinates (n = 88). The central tendency and distribution of values for the excluded subset were very similar to those of the sample we used, and we therefore consider the possible bias in the perceived accessibility to be small. A third limitation, and another accuracy issue, is that our analysis may be based on an incomplete dataset of the roads and paths that inhabitants may have used to get to and from the health care facilities. This was, however, also a reason to use the raster-based path analysis, since it allows any possible path between household locations and health care centres (paths are thus not restricted to existing roads). Fourth, although we multiplied the slope values by a factor to make the representations of walking time uphill, downhill, and on a flat surface more realistic, we have not taken into account that walking speed varies with age, health, and season [36].
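To make the fourth limitation concrete, the following minimal sketch shows how Tobler's hiking function turns slope into walking speed and how a path's walking time follows from it. It uses the standard form of the function, 6 exp(-3.5 |S + 0.05|) km/h with S as rise over run, and does not reproduce the slope factor or the horizontal impedance applied in the study. The `speed_factor` parameter is a hypothetical multiplier, not part of our model, included only to show where differences due to age, health, or season could enter.

```python
import math

def tobler_speed_kmh(slope, speed_factor=1.0):
    """Walking speed in km/h for a slope given as rise/run.

    Negative slopes are downhill. `speed_factor` is a hypothetical
    multiplier for walker- or season-specific adjustments.
    """
    return speed_factor * 6.0 * math.exp(-3.5 * abs(slope + 0.05))

def walking_time_hours(segments, speed_factor=1.0):
    """Total walking time for a path.

    `segments` is an iterable of (horizontal_distance_km, rise_km) pairs;
    each segment is assumed to have a constant slope.
    """
    total = 0.0
    for dist_km, rise_km in segments:
        slope = rise_km / dist_km
        total += dist_km / tobler_speed_kmh(slope, speed_factor)
    return total

# The same 2 km segment with 200 m of elevation change, walked in each direction.
uphill = walking_time_hours([(2.0, 0.2)])
downhill = walking_time_hours([(2.0, -0.2)])
```

Because the function peaks at a gentle downhill slope, the outbound and return trips along the same path generally take different amounts of time, which is why total walking time to and from the facility is the relevant quantity.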
Fifth, the survey questions (Table 1) are essentially about the level of barriers to access, and we may therefore have overlooked relevant fieldwork sites, since our screening is based on one set of parameters that possibly excludes other relevant ones. However, this is indeed the purpose of our methodology. We identify the places where the model yields the highest residual values, since these are the places where qualitative fieldwork is likely to uncover other social, cultural, and environmental factors. Operationalizing new variables identified as important during the qualitative fieldwork carried out in 2017 did improve the statistical model. In addition, a new residual analysis may point us towards other fieldwork venues for a further improved understanding [60].

Conclusions
The major contribution of this article has been to identify favourable sites for qualitative fieldwork aimed at further increasing our understanding of barriers to health care accessibility. One site was identified as particularly favourable for a case study, and fieldwork was subsequently carried out in this location. The fieldwork indicated that the perception of access is influenced by the accommodation and appropriateness of services, that is, by the qualitative elements of patient care and the effective utilization of available resources. To improve the use of available resources, an important policy intervention would be to strengthen patient-doctor relations, e.g., by emphasising such relations in the educational health curriculum.
We have shown that the 'perceived state of things' differs from the 'measured state of things' with regard to health care access in Malawi. A similar deviation may be found elsewhere and/or for other topics such as vulnerability or living conditions. Investigations that map the geography of vulnerability or living conditions often result in maps that rank municipalities, neighbourhoods, or other areas. People living in the most vulnerable municipalities or in the neighbourhoods with the lowest living conditions may react negatively to the 'expert assessment' if it deviates much from their perceived level of vulnerability or living conditions. The prospective screening that we used to identify favourable fieldwork sites did indeed lead to an increased understanding of barriers to health care in Malawi, and similar benefits may be achieved elsewhere as well as for topics such as vulnerability and living conditions. We have represented the 'perceived state of things' with survey responses, but this could be done in other ways if no survey is available or if one is too cumbersome to arrange.

Informed Consent Statement: The study obtained oral consent from all respondents.
Data Availability Statement: The processing was done using ArcGIS version 10.8. Input data, the resulting dataset, and Python scripts documenting the work progress are available from the Norwegian Centre for Research Data (available from here: https://arkivering.nsd.no/ac1b0002-7710-1ef1-8179-7f8a1d01012f, accessed on 25 July 2021).