1. Introduction
The Huff model is of critical significance in urban transport, economics, and business, where it helps us understand accessibility, business opportunities, and the source and distribution of customers, and supports optimal location planning for new trading areas [1,2,3,4,5]. Many methods exist for analyzing trading areas, such as the Ring model [6], regression models [7,8], the analog model [9], and the Huff model. Among these, the Huff model is a quantitative method widely used to explore interactions in the urban environment [10,11]. Before the Huff model can be applied, it must be calibrated with high accuracy [12], and previous studies have paid much attention to this calibration [13,14,15]. Traditionally, the calibration of spatial interaction models in human mobility research depends on surveys or questionnaires, which are labor-intensive, time-consuming, and error-prone; usually have a poor response rate [16,17]; and sometimes lack proper sampling mechanisms. Improper sampling methods may lead to biased or non-representative samples [18,19,20,21,22] and may thus influence the calibration of the spatial interaction model [23,24]. The number of sampling locations is also a concern in many studies. For example, Zhou et al. [25] investigate how many samples are needed for good performance in road selection and find that only a small number (e.g., 50–100) of training samples is needed, while Zhao et al. [26] indicate that sparsely sampled call detail records introduce biases into human mobility research. Thus, one of the main tasks of this paper is to investigate the effects of different numbers of sampling locations on calibrating the Huff model.
Recently, researchers have tended to use larger multi-source datasets in search of better solutions. The advent of information and communication technology (ICT) aids the acquisition of human trajectory data by lowering the cost of collecting, storing, processing, and sharing data and information [27,28]. Large-volume data (such as GPS tracking data, mobile phone location data, and social media check-in data) give new insights into and a better understanding of human mobility and behaviors [29,30]; community detection [31,32,33]; urban activity space and dynamics [34,35]; and spatial interaction and modeling [36,37]. For the calibration of spatial interaction models, most studies use all sampling locations. For example, Yue et al. [38] and Markham et al. [39] use whole datasets to calibrate spatial interaction models, but whether a small part of a dataset can yield more accurate calibration results remains unresolved.
Among all kinds of big geodata, mobile phone location data are special because mobile phones have an extremely high penetration rate, which can exceed 94% in Asian countries such as China [40], and people usually carry their cell phones with them. Thus, some researchers view this type of data as a reasonable source for describing human mobility and modeling spatial interactions [29,31,33,34,35,37], and many valuable findings on human dynamics in the era of big data have been derived from it. For example, Gao et al. [31] propose an alternative modularity function that incorporates a calibrated gravity model to discover the clustering structures of spatial-interaction communities generated by massive numbers of mobile phone users. Liang et al. [41] analyze collective intra-urban mobility using a modified spatial interaction model, and Simini et al. [42] propose a radiation model that, compared with a calibrated gravity model, predicts mobility patterns in good agreement with observed data from different sources, including mobile phone data. Whether the models were well calibrated to “best fit” the observed data needs to be answered before such comparison and application. If so, a further question is how to identify the more valuable sampling locations for high-accuracy calibration. Additionally, Vij et al. [43] take a neutral attitude towards the volume of big data, pointing out that high-quality but small-volume data may be better than big data, and that small-volume datasets represent not only dimension reduction but also noise elimination from big data.
The calibration methodology has been widely discussed [44,45,46] and is not the main focus of this paper. Instead, we investigate the effects of sampling locations by calibrating a spatial interaction model as a case study, using mobile phone location data, and attempt to answer the following questions:
- (1)
Does calibrating the Huff model with all sampling locations always perform better than calibrating it with a small volume of sampling locations?
- (2)
If not, what kinds of sampling locations are more effective for calibrating this model?
This study makes several contributions. Firstly, the results show that a small volume of sampling locations may calibrate the Huff model more effectively than a large volume, which could help make better use of big data for human mobility modeling. Secondly, we propose a method to select the more effective locations from massive numbers of mobile phone towers to improve model calibration, which could be used to guide surveys or questionnaires for trading area analysis in real-world settings. This method could save both survey time and expense in many related areas of research while achieving high-quality calibration results. To the best of our knowledge, this is the first work to discuss whether a large number of sampling locations is always more effective for calibrating spatial interaction models and to apply business area analysis to location data from mobile phone users, opening a new area for business applications.
The rest of this paper is organized as follows.
Section 2 introduces the mobile phone location dataset and the study area.
Section 3 describes the method to extract the trips to business areas and the strategy to calibrate the spatial interaction model.
Section 4 discusses the analysis results.
Section 5 summarizes our findings and discusses future research directions.
4. Results
4.1. Distribution of SSE
We calculated the distribution of SSE under each calibration parameter, as shown in Figure 5. The SSE over all 2621 cell phone towers can differ considerably depending on the number of randomly selected cell phone towers used for calibration.
Firstly, as the number of random cell towers grows, the SSE fluctuates less and approaches 1205, the total sum of squared errors (TSSE) obtained when all cell towers are used for calibration. In particular, when more than 900 cell phone towers are used, the SSE lies between 1180 and 1220. As the number of towers increases, the range of the SSE narrows.
Secondly, the fewer random cell towers are used, the more the SSE fluctuates. When the number of cell phone towers is low (such as 150 cell phone towers), the calibration result can be either better or worse than when all cell phone towers are used. For example, when 30 cell towers are used for calibration, the SSE fluctuates from approximately 1150 to 1318. Some cell phone towers may appear more than once across the random combinations. Later, we will investigate the general characteristics of the best-performing towers.
Most importantly, a small number of random sampling locations can yield better calibration results than a large number. We can conclude that more sampling locations do not always lead to better solutions when calibrating spatial interaction models. In other words, when we conduct surveys or questionnaires, the choice of locations is very important, and when we use large location datasets to calibrate a spatial interaction model, not all sampling locations are valuable. The fluctuation of SSE is greatest when 30 towers are used for calibration. In the next section, we therefore use random samples of 30 towers to investigate the hidden patterns behind the better-performing calibrations.
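The random-subset experiment can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the data are synthetic, the Huff form uses size and distance exponents named `alpha` and `beta`, and the calibration routine (least squares on log share ratios) is a common stand-in rather than the calibration strategy of this paper. The point is only the workflow: calibrate on a random subset, then evaluate SSE over all towers and compare it with the value obtained from calibrating on all towers.

```python
import math
import random

def huff_shares(sizes, dists, alpha, beta):
    """Estimated shares T_ij for one tower: S_j**alpha / d_ij**beta, normalised."""
    utilities = [s ** alpha / d ** beta for s, d in zip(sizes, dists)]
    total = sum(utilities)
    return [u / total for u in utilities]

def sse(towers, sizes, alpha, beta):
    """Sum of squared errors between observed P_ij and estimated T_ij."""
    return sum(
        (p - t) ** 2
        for dists, observed in towers
        for p, t in zip(observed, huff_shares(sizes, dists, alpha, beta))
    )

def calibrate(towers, sizes):
    """Assumed stand-in calibration: least squares on log share ratios.

    ln(P_ij / P_i0) = alpha * ln(S_j / S_0) - beta * ln(d_ij / d_i0)
    """
    sxx = sxz = szz = sxy = szy = 0.0
    for dists, observed in towers:
        for j in range(1, len(sizes)):
            x = math.log(sizes[j] / sizes[0])
            z = -math.log(dists[j] / dists[0])
            y = math.log(observed[j] / observed[0])
            sxx += x * x
            sxz += x * z
            szz += z * z
            sxy += x * y
            szy += z * y
    det = sxx * szz - sxz * sxz
    alpha = (sxy * szz - szy * sxz) / det
    beta = (szy * sxx - sxy * sxz) / det
    return alpha, beta

def make_towers(n, sizes, alpha, beta, noise, rng):
    """Synthetic towers: random distances (km) and noisy Huff shares."""
    towers = []
    for _ in range(n):
        dists = [rng.uniform(0.5, 15.0) for _ in sizes]
        shares = huff_shares(sizes, dists, alpha, beta)
        noisy = [s * math.exp(rng.gauss(0.0, noise)) for s in shares]
        total = sum(noisy)
        towers.append((dists, [v / total for v in noisy]))
    return towers

rng = random.Random(0)
sizes = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical attractiveness values
towers = make_towers(400, sizes, 1.0, 2.0, 0.3, rng)

# SSE over all towers when all towers are used for calibration.
tsse = sse(towers, sizes, *calibrate(towers, sizes))

# SSE over all towers when only a random subset is used for calibration.
for n in (30, 100, 300):
    runs = [sse(towers, sizes, *calibrate(rng.sample(towers, n), sizes))
            for _ in range(50)]
    print(n, round(min(runs), 3), round(max(runs), 3), round(tsse, 3))
```

Because the stand-in calibration does not minimise SSE directly, a subset can in principle give a lower SSE over all towers than the full set, which is the kind of behaviour examined in this section; the synthetic numbers themselves carry no empirical meaning.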
4.2. Finding Out Which Cell Phone Towers Best Fit Each Commercial Area
Previously, we assumed that the better-performing towers share some common characteristics. Firstly, we measured the similarity between the estimated percentage of trips (Tij) from a location to each commercial area and the observed percentage of trips (Pij) towards each commercial area. The commercial area with the most similar pair is considered the tower’s best fit. The similarity index (SI) is measured by,
where Tij is the estimated percentage of flows from tower i to commercial area j, and Pij is the observed percentage of flows. After each cell phone tower is tagged with its best fit commercial area, we determine whether the tower lies within the tagged commercial area’s Thiessen polygon (the Thiessen polygons are derived from the centers of the commercial areas). Because of the large number of random selections, some towers are selected more than once. In this case, the commercial area that is most frequently the best fit is assigned as the tower’s tagged commercial area.
Finally, each cell phone tower is classified by its best fit commercial area, as shown in Figure 6. Combining this classification with the Thiessen polygon in which each tower lies, we obtain the statistics in Table 3. The table shows that, except for “D”, each commercial area has its largest percentage of best fit towers within its own scope, especially “R”, “H” and “N”, where the percentages of best-fit towers within their scope are 70.22%, 62.37%, and 61.35%, respectively.
However, the highest percentage of best fit towers is not always within the area’s own scope. For example, the percentage of best fit towers in “D” is only 23.61%, whereas 34.84% and 37.70% of towers in “D” are best fit for “R” and “H”, respectively, both higher than for “D” itself. At the same time, although the percentage of best fit towers within polygon “O” is the highest (28.58%), 61.42% of towers are best fit for other nonadjacent commercial areas.
In most cases (“R”, “H”, “O” and “N”), the highest percentage of best fit towers lies within the area’s own polygon, which suggests that spatial adjacency can play a role when choosing the random sample. Next, we select towers that best fit their adjacent commercial areas to further reveal the attributes (distance and flows) of these towers.
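The tagging step can be sketched in a few lines of Python. The sketch relies on the fact that, for planar Euclidean coordinates, a point lies in the Thiessen polygon of the center it is closest to, so polygon membership reduces to a nearest-center lookup. The coordinates, labels, and function names are hypothetical and only illustrate the logic: take the most frequent best-fit label over the random selections, then check whether it matches the polygon in which the tower lies.

```python
import math
from collections import Counter

def nearest_area(point, centers):
    """Label of the closest center, i.e. the Thiessen polygon containing `point`."""
    return min(centers, key=lambda name: math.dist(point, centers[name]))

def tagged_area(best_fit_labels):
    """Most frequent best-fit label over the random selections of one tower."""
    return Counter(best_fit_labels).most_common(1)[0][0]

# Hypothetical projected coordinates (km) of three commercial area centers.
centers = {"R": (0.0, 0.0), "H": (10.0, 0.0), "D": (5.0, 8.0)}
tower_xy = (2.0, 1.0)
# Hypothetical best-fit labels of this tower in five random selections.
draws = ["R", "H", "R", "R", "D"]

tag = tagged_area(draws)
inside_own_polygon = tag == nearest_area(tower_xy, centers)
print(tag, inside_own_polygon)
```

Ties in the label counts are resolved by first occurrence here, which is an arbitrary choice of this sketch.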
4.3. High-Accuracy Calibration by Using Spatial Adjacency
Distance in this paper is represented by spatial adjacency [58]. To examine whether a tower’s best fit commercial area is consistent with its most adjacent commercial area, we divide the 2621 cell phone towers into two sets:
- (1)
Set A: The tower’s best fit commercial area is consistent with the tower’s most adjacent commercial area. This subset accounts for 45.64% of the 2621 cell phone towers, as shown in Figure 7a.
- (2)
Set B: The tower’s best fit commercial area is not consistent with the tower’s most adjacent commercial area. This subset accounts for 54.36% of the 2621 cell phone towers, as shown in Figure 7b.
Set A is therefore the complement of Set B, and together the two sets make up the 2621 cell phone towers. The spatial distributions of the two sets are shown in Figure 7. It is evident that Set A and Set B are spatially intermixed.
To investigate the different effects of the two sets on the calibration of the Huff model, samples of 30 phone towers and integer multiples thereof (60, 90, etc.) are randomly selected from each set. For each sample size, the random selection is repeated 500 times. Each time, the bias between the observed Pij and the estimated Tij over all 2621 cell phone towers is measured by SSE. The distributions of the SSE for the two sets are shown in Figure 8. The average SSE and the percentages of selections in which SSE fell below and above 1205 are reported in Table 4.
Table 4 shows that the average SSE from Set A is 1189.3, which is lower than the TSSE (1205). When Set B is used to calibrate the Huff model, the average SSE is 1205.4, which is nearly equal to the TSSE. Moreover, with Set A, SSE is better than TSSE in 96.2% of the random selections, far more often than with Set B (only 3.8%). Therefore, using Set A (towers whose best fit commercial area is consistent with their most adjacent commercial area) results in a more effective calibration.
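The bookkeeping behind this comparison can be sketched as follows. The tower records and SSE values are hypothetical placeholders, not results from Table 4; the sketch only shows how towers are split by whether their best fit and most adjacent commercial areas agree, and how a list of SSE values from repeated selections is summarised against the TSSE.

```python
def split_sets(towers):
    """Split towers given as (tower_id, best_fit_area, nearest_area) tuples."""
    set_a = [tid for tid, fit, near in towers if fit == near]
    set_b = [tid for tid, fit, near in towers if fit != near]
    return set_a, set_b

def summarize(sse_runs, tsse):
    """Mean SSE and the percentages of runs strictly below and above TSSE."""
    n = len(sse_runs)
    mean_sse = sum(sse_runs) / n
    pct_below = 100.0 * sum(v < tsse for v in sse_runs) / n
    pct_above = 100.0 * sum(v > tsse for v in sse_runs) / n
    return mean_sse, pct_below, pct_above

# Hypothetical tower records and SSE values, for illustration only.
towers = [(1, "R", "R"), (2, "H", "D"), (3, "N", "N"), (4, "R", "O")]
set_a, set_b = split_sets(towers)
runs = [1180.0, 1195.0, 1210.0, 1200.0]
print(set_a, set_b, summarize(runs, 1205.0))
```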
So far, however, it remains unclear how to identify Set A directly. Spatially, Set A and Set B are well mixed, and they contain 45.64% and 54.36% of the cell phone towers, respectively, both close to 50%. To distinguish the two sets directly, we calculate the volume of flows for each set, as shown in Table 5.
Table 5 shows that the average number of flows per cell phone tower in Set A is 370, much higher than the average of 150 in Set B. Further, 31.9% of the cell phone towers in Set A have more than 150 trips, compared with only 19.7% in Set B. Thus, the volume of trips from each tower plays a major role in distinguishing the better-performing towers among the 2621 cell phone towers. In the next section, we investigate how the volume of trips affects the calibration results.
4.4. High-Accuracy Calibration by Volume of Attracted Trips
4.4.1. Calibration by Using Top 30 Cell Phone Towers with Highest Trips
According to the previous experiments, the volume of trips from each cell phone tower is a criterion for distinguishing the better-performing towers among all 2621 cell phone towers. In this section, we select the top 30 cell phone towers with the highest number of trips to the five commercial areas. Their spatial distribution is shown in Figure 9. The model parameters are then calibrated with these 30 towers, and the resulting SSE is 1165.1, which is much lower than the TSSE (obtained by using all the towers to calibrate the model).
Each cell phone tower has five different distances, one to each of the five commercial areas. We divide the urban space into bands of 3 km according to the distance from the cell phone towers to the five commercial areas. A cell phone tower may lie within one commercial area’s 3 km buffer while also lying within another commercial area’s 6 km buffer. If a cell phone tower is located within the 3 km buffer of at least one commercial area, we define it as being within the 3 km scope. Then, we determine whether the bias of each cell phone tower is below (better than) or above (worse than) the average bias. The average bias is the mean value of SSE of the 2621 cell phone towers. As shown in Table 6, when the top 30 cell phone towers with the highest number of trips are used to calibrate the model, 76.8% of the 961 cell phone towers in the 3 km buffer scope perform better than average and only 23.20% perform worse. For the 742 cell phone towers in the 3 to 6 km scope, 52.07% perform better than average. When the buffer scope is over 9 km, more than 81% of cell phone towers in that scope perform worse than average, but the total number of towers in that scope is much smaller than within 6 km.
Selecting cell phone towers with a large volume of trips for calibration can significantly benefit the model for towers located within 6 km. In the next section, we verify the effects on calibration of cell phone towers with different volumes of trips.
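The buffer classification can be sketched as follows, assuming that a tower is assigned to the band of its nearest commercial area (which is what “within at least one commercial area’s 3 km scope” amounts to) and that a tower exactly on a boundary belongs to the inner band. The function names and the four example towers are hypothetical.

```python
import math
from collections import defaultdict

def band_index(dists_km, width_km=3.0):
    """1 for 0-3 km from the nearest commercial area, 2 for 3-6 km, and so on."""
    return max(1, math.ceil(min(dists_km) / width_km))

def share_better_by_band(towers):
    """Percentage of towers per band whose bias is below the overall mean bias.

    `towers` is a list of (distances_km, bias) pairs, one per tower.
    """
    average = sum(bias for _, bias in towers) / len(towers)
    counts = defaultdict(lambda: [0, 0])  # band -> [better than average, total]
    for dists, bias in towers:
        entry = counts[band_index(dists)]
        entry[0] += bias < average
        entry[1] += 1
    return {band: 100.0 * better / total
            for band, (better, total) in sorted(counts.items())}

# Hypothetical towers: distances to two commercial areas and a per-tower bias.
towers = [([2.0, 7.0], 0.1), ([4.0, 8.0], 0.5),
          ([1.0, 10.0], 0.2), ([10.0, 12.0], 0.9)]
print(share_better_by_band(towers))
```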
4.4.2. Calibration by Using Selected Towers with Higher Volume of Flows
From the previous experiments, we can conclude that the effects of spatial proximity are reflected by flows. The best fit towers within an area’s polygon scope have higher flows and perform better. In particular, the top 30 towers with the highest number of trips also perform better. Thus, we use different volumes of flow to test the effects of flows on parameter calibration.
We select cell phone towers whose number of trips exceeds a lower bound, which starts at 10 and increases in steps of 10. The distribution of the percentage of cell phone towers with specified lower bounds of trips is shown in Figure 10. As the lower bound of trips increases, cell phone towers with a small number of trips are gradually excluded. In each case, we randomly select 30 cell phone towers 500 times to calibrate the Huff model. Then, the calibrated parameters are used to calculate the SSE for all 2621 cell phone towers. Each SSE is compared with the TSSE. Finally, we determine the percentage of random selections in which the SSE is lower than TSSE, as shown in Figure 11. The horizontal axis represents the lower bound of trips, namely, the number of trips of the selected towers is higher than the specified value. The vertical axis represents the percentage of times where the SSE is lower than TSSE over all 500 random selections. The maximum lower bound of trips is set to 500 because only 10% of towers have more than 500 trips.
From Figure 11, it is clear that the percentage of times where the SSE is lower than TSSE rises markedly from 76% to 95% as the lower bound of trips increases from 10 to 70. In particular, when the lower bound of trips is higher than 70, the percentage of times where the SSE is lower than TSSE stays above 95%. This result indicates that the probability of obtaining better results is greater when cell phone towers with a large volume of trips are used to calibrate the spatial interaction model. However, it remains open how parameters calibrated with high-volume towers affect towers with a small volume of trips. In the next section, we examine this effect.
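The lower-bound experiment can be outlined with a small harness. The harness takes the calibration and evaluation step as a callable, because that step depends on the calibration strategy; the `toy_evaluate` function used to exercise it is a made-up stand-in with no empirical meaning, as are the tower IDs and trip counts.

```python
import random

def share_better_than_tsse(towers, lower_bound, evaluate, tsse,
                           sample_size=30, repeats=500, seed=0):
    """Percentage of random selections whose SSE over all towers is below TSSE.

    `towers` is a list of (tower_id, trips) pairs; `evaluate(sample_ids)` is
    assumed to calibrate on the sample and return the SSE over all towers.
    """
    rng = random.Random(seed)
    candidates = [tid for tid, trips in towers if trips > lower_bound]
    if len(candidates) < sample_size:
        raise ValueError("not enough towers above the lower bound")
    wins = sum(evaluate(rng.sample(candidates, sample_size)) < tsse
               for _ in range(repeats))
    return 100.0 * wins / repeats

# Hypothetical towers with trip counts 5, 8, 11, ..., 602.
towers = [(i, 5 + 3 * i) for i in range(200)]
trips = dict(towers)

def toy_evaluate(sample_ids):
    """Toy stand-in: pretends SSE falls as the mean trip count of the sample rises."""
    mean_trips = sum(trips[i] for i in sample_ids) / len(sample_ids)
    return 1250.0 - 0.5 * mean_trips

for bound in range(10, 80, 10):
    print(bound, share_better_than_tsse(towers, bound, toy_evaluate, 1205.0))
```

Fixing the seed makes each lower bound reproducible; in a real run, `evaluate` would wrap the actual calibration and the SSE computation over all towers.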
4.4.3. Effects on Towers with “Small” Volume of Trips
From the experiment above, the probability of obtaining better results is higher when cell phone towers with a large volume of trips are used to calibrate the spatial interaction model. However, the SSE is an overall measurement of the bias between the observed and estimated probabilities. When we choose towers with a high volume of trips to calibrate the model, the effect on towers with a small volume of trips is ignored, and the overall better result may be obtained at their expense. In this part, we take the towers with more than 10 trips as the candidate set and evaluate SSES, the SSE restricted to towers with fewer than 70 trips. There are 1641 towers with fewer than 70 trips, so SSES is the bias between the estimated and observed probabilities of these 1641 cell phone towers. We regard these 1641 towers as having a small volume of trips because, when the lower bound of trips is higher than 70, the percentage of times where the SSE is lower than TSSE stays above 95%.
The resulting distributions of SSES are shown in Figure 12. The horizontal axis represents the lower bound of trips, namely, the number of trips of the selected towers is higher than the specified value. The vertical axis represents the SSES over all 500 random selections. The maximum lower bound of trips is again set to 500.
Figure 12 shows that SSES stays between 297 and 348 regardless of whether the lower bound of trips is 10 or 500. At each lower bound, the SSES covers a similar interval (of length approximately 51). Thus, when towers with a large volume of trips are chosen to calibrate the model, the bias of towers with a small volume of trips, as evaluated by SSE, is not affected; that is, the concern that towers with a small number of trips may be sacrificed to obtain the best overall results can be dismissed. Finally, using cell phone towers with a high volume of trips to calibrate the spatial interaction model is a good choice not only for obtaining better results but also for reducing computational demand.
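For concreteness, the restricted error measure can be written as a short function. The record layout (trips, observed shares, estimated shares) and the two example towers are assumptions made for this sketch; only the restriction to towers with fewer than 70 trips follows the text.

```python
def sse_small(per_tower, max_trips=70):
    """SSE restricted to towers with fewer than `max_trips` trips.

    `per_tower` is a list of (trips, observed_shares, estimated_shares) records.
    """
    return sum(
        (p - t) ** 2
        for trips, observed, estimated in per_tower
        if trips < max_trips
        for p, t in zip(observed, estimated)
    )

# Hypothetical records: one low-volume tower and one high-volume tower.
records = [
    (20, [0.5, 0.5], [0.4, 0.6]),
    (300, [0.9, 0.1], [0.5, 0.5]),
]
print(sse_small(records))
```

Only the first record contributes here, since the second tower has 300 trips.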
5. Conclusions
Advancements in information and communication technology over the past two decades have produced massive and varied big location data, which encourage novel insights into human travel and activity patterns and other areas of research. However, whether a large volume of sampling locations is effective for calibrating spatial interaction models is still an open question in mobility research. This paper attempts to answer this question from the perspective of Huff model calibration, using massive mobile phone location data, and draws the following conclusions.
On the one hand, for the calibration of the Huff model, more sampling locations do not necessarily give a better calibration result. When all cell phone towers are used for calibration, the SSE is not the lowest. Moreover, the fewer random cell towers are used, the more the SSE fluctuates. Nevertheless, small random samples can yield better calibration results than large random samples. In the calibration of a spatial interaction model, too many sampling locations may be just as bad as too few. Some special locations hidden in large location datasets are more valuable and should be identified and analyzed to provide new insights into data science.
On the other hand, when we examined the characteristics of the better-performing towers, we found that towers that best fit their adjacent commercial area are good choices, which illustrates that spatial proximity plays a role when selecting the random sample. Moreover, cell phone towers with this characteristic have a larger volume of trips than the other towers. Thus, the volume of flows from cell phone towers is a measure for distinguishing the valuable locations from the poorly performing ones. When we randomly selected 30 towers with more than 70 trips, the percentage of times where the SSE is lower than TSSE stayed above 95%. Furthermore, when towers with a large volume of trips are chosen to calibrate the model, the bias of towers with a small volume of trips, as evaluated by SSE, is not affected; that is, the concern that towers with few trips may be sacrificed to obtain the best overall results can be dismissed. Thus, using sampling locations with a high volume of trips to calibrate the spatial interaction model is a good choice not only for obtaining better results but also for reducing computational demand.
However, we do note several limitations and challenges of this research, such as:
- (1)
In this paper, we adopted the Huff model to define business areas and used only size to represent attractiveness. This simplification created a mismatch between the predicted attracted areas and the observed data. Other factors, such as the number of POIs, parking conditions, price level, and the types of companies and malls in business areas, may also influence attractiveness. In the future, additional research is needed to identify the detailed attractiveness factors and a proper spatial interaction model to better depict the relationships.
- (2)
Another limitation is that we have not examined the social characteristics of these better-performing locations. Combining other factors, such as resident distribution, income, and land use type, may reveal the social aspects of these locations and thus provide better guidance for surveying or sampling.
- (3)
In this paper, we investigated the effects of sampling locations on the calibration of a spatial interaction model between the urban environment and commercial areas. However, our findings may not be applicable to other land use types, because different land use patterns also play a role in model calibration.
- (4)
There may be some uncertainties in the extraction of origins/destinations from mobile phone data. It is possible that the “origins” used in this paper were merely passing-by locations, because the footprints of mobile phone subscribers were sparsely sampled in space and time [50], so it is hard to restrict an “origin” to a “stay” where the subscriber has spent a certain amount of time. In the future, datasets such as GPS tracking data could be used to reduce the potential uncertainty in extracting origins or destinations.