A GIS-Based Bivariate Logistic Regression Model for the Site-Suitability Analysis of Parcel-Pickup Lockers: A Case Study of Guangzhou, China

The site-suitability analysis (SSA) of parcel-pickup lockers (PPLs) is becoming a critical problem in last-mile logistics. Most studies have focused on the site-selection problem to identify the best site from given potential sites in specific areas, while few have solved the site-search problem to determine the boundary of the suitable area. A GIS-based bivariate logistic regression (LR) model using the supervised machine-learning (ML) algorithm was developed for suitability classification in this study. Eight crucial factors were selected from 27 candidate variables using stepwise methods with a training dataset in the best LR model. The variable of the proximity to residential buildings was more important than that to various commercial buildings, transport services, and roads. Among the four types of residential buildings, the most crucial factor was the proximity to residential quarters. A test dataset was employed for the validation process, showing that the best LR model had excellent performance. The results identified the suitable areas for PPLs, accounting for 8% of the total area of Guangzhou (GZ). A decision-maker can focus on these suitable areas as the site-selection ranges for PPLs, which significantly reduces the difficulty of analysis and time costs. This method can quickly decompose a large-scale area into several small-scale suitable areas, with relevance to the problem of selecting sites from various candidate sites.


Introduction
The rapid development of e-commerce has severely impacted parcel distribution, and the last-mile delivery problem restricts logistics development. Many e-commerce companies, logistics service providers, and other stakeholders consider effective last-mile delivery systems to be an essential competitive advantage and have attempted to tackle the bottleneck with innovative methods, such as parcel-pickup points (PPPs, also called collection and delivery points), drone delivery, and autonomous ground vehicle delivery [1][2][3][4]. The PPP is the most widely used of these novel solutions: it helps firms reduce costs through consolidated shipments and provides customers with a flexible, convenient, and comfortable means of receiving parcels.
PPPs have garnered significant interest in logistics research. Studies have addressed the advantages of PPPs, such as economic efficiency, environmental friendliness, and high service quality [5][6][7][8]. There are two types of PPPs: parcel-pickup shops (PPSs) and parcel-pickup lockers (PPLs). PPLs rely on intelligent technology without human interaction, whereas PPSs cooperate with commercial facilities. PPLs offer the advantages of long opening hours, flexible collection times, and anonymity: consumers can collect their parcels without being bound to shop opening hours, and parcels can be retrieved anonymously because no human interaction is required [9,10]. Given that PPSs

Literature Review
The three core issues related to location analysis for PPPs in previous studies are (1) influencing factors, (2) spatial distribution patterns, and (3) site selection. Regarding the influencing factors, some studies state that the distribution of PPPs is strongly related to population density, land-use types, urban development, and spatial accessibility, according to their agglomeration pattern [18][19][20][21]. Other studies have found that residents' behavior is also related to the PPP layout, and have thus developed methods for measuring customers' spatial access to PPPs that consider differentiated supply and demand [22]. For spatial distribution patterns, the patterns of PPSs in several Chinese cities (Changsha, Wuhan, and Xi'an) were investigated using point-of-interest (POI) data [21,23,24]. The results showed that there are more PPSs in the central regions and fewer in the peripheral regions, with multi-core agglomerations in general. For site selection, researchers determined the best sites by ranking or rating candidates based on different indicators [16,17]. In general, previous studies related to location analysis for PPPs only analyzed location characteristics and impact factors. Few studies have addressed the site-search problem in SSA, which identifies the boundaries of suitable sites in a large-scale area such as a metropolis.
GIS-based SSA techniques are widely applied in urban, regional, and environmental planning activities, such as labeling potential hazards, ecological resources, habitats, and geological favorability, or locating advantageous sites for facilities, agricultural activities, and urban development [15,[25][26][27][28][29]. The challenging aspect of GIS-based SSA is determining the important factors and their weights. The three major groups of approaches to GIS-based SSA are computer-assisted overlay mapping, multi-criteria evaluation (MCE), and ML algorithms [15]. However, a criticism of the computer-assisted overlay-mapping approach is that it is often used without verifying independent assumptions regarding the suitability criteria, nor is it standardized using appropriate methods [30]. In the MCE approach, the weights of the suitability criteria are determined subjectively, which is imprecise and ambiguous, and different multi-criteria evaluation rules generate remarkably different suitability patterns [15,31]. As a new data-driven technique, ML can overcome the limitations of the aforementioned approaches and better address problems involving enormous datasets. There are two types of models for ML algorithms: white-box models are explainable models that allow an interpretation of the model parameters, whereas black-box models, such as support vector machines or artificial neural networks, do not allow such an interpretation and can only be verified externally [32]. The LR model is the most common and useful white-box model for supervised classification due to its easy and efficient operation. The data types of the variables can be continuous or categorical, and the result of the LR model is measured as a probability from 0 to 1, which can be considered a suitability index.
Thus, the large-scale area in this study was subdivided into a micro-scale raster to form the basic units of observation, and the classification of each raster was conducted according to its suitability index.
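As a concrete illustration of this classification setup, the following Python sketch (using scikit-learn, with synthetic data standing in for the raster observations) shows how a fitted LR model yields one probability per observation that can be read as a suitability index; the data and decision rule here are hypothetical, not the study's:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for the observation data: each row is one raster
# cell's explanatory variables; y = 1 (suitable) or 0 (unsuitable).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] - 0.5 * X[:, 1] > 0).astype(int)  # hypothetical decision rule

model = LogisticRegression().fit(X, y)

# predict_proba returns P(class = 1): a suitability index in [0, 1],
# and the fitted coefficients remain interpretable (white-box model).
suitability = model.predict_proba(X)[:, 1]
coefficients = model.coef_[0]
```

Because the coefficients are directly inspectable, the sign and magnitude of each one can be interpreted just as the Wald values and coefficient signs are interpreted later in this paper.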

Study Area and Data
China has the largest e-commerce market globally, with over 40% of global e-commerce transactions originating from the country as of 2017. Guangzhou (GZ) is one of the four most developed metropolises in China, where PPLs occupied the market at an early stage. Furthermore, GZ was ranked first for parcel receipts in China for seven consecutive years, from 2014 to 2020 [33]. In this study, the suitability modeling for PPLs was conducted using five types of data: POI data, road-network data, population data with a resolution of 100 m, land-price data, and a digital elevation model (DEM) with a resolution of 30 m, as shown in Table 1. Given the large quantity and wide distribution of PPLs and the related facilities, manual data acquisition was time-consuming and inaccurate, hindering the progress of PPL research. POI data, a novel form of data incorporating information such as latitudinal and longitudinal coordinates, specific locations, place names, and other attributes, played an essential role in the analysis of macro-scale spatial distribution characteristics. POI data have the advantages of comprehensive coverage, high recognition accuracy, and high accessibility; thus, POI big data improved the quality of micro-scale studies on PPL locations. In this study, POI data were obtained from Gaode Map, an everyday navigation application popular in China, which uses three-level classification codes to classify POI objects. Through the open application programming interface (API) of Gaode Map, developers can extract data for a specific area, a specific category, or a keyword in the name. According to the literature, PPL distributions are strongly related to traffic convenience and to residential and commercial areas. The influential factors from the POI data were chosen from two major categories with several subcategories: transportation service and commercial/house, as shown in Table 2.
The locations of PPLs were searched for using the keywords 'parcel locker' or 'self-pickup locker'. A total of 679 PPLs were extracted from Gaode Map in 2019. The road-network data were collected from OpenStreetMap (OSM). Figure 2 shows the methodology used in this research. It mainly consisted of six parts: (1) the conversion of the multi-source data to the same scale, (2) the preparation of the observation data, (3) the diagnosis of the assumptions of the LR model, (4) the determination of the best combination of explanatory variables, (5) the evaluation of the model's performance, and (6) the generation of the suitability map using the best model.

Methodology
Variables X1 to X27 are explained in Table 3, and their distribution maps are shown in Figure 3.


Conversion of the Multi-Source Data to the Same Scale
The challenges associated with multi-source data were attributable to the different types and scales of the data. The multi-source data should be unified to the same type and unit in the preprocessing stage. This study used four different data types-vector-line, vector-point, vector-polygon, and raster data-with different resolutions. As this study aimed to identify suitable areas at the pixel level, all the data needed to be converted to the same data type (raster) with the same resolution. The vector-line and point data were converted using the Euclidean distance and kernel density method. The vector-polygon data were directly converted to raster data. Higher-resolution raster data were converted to a lower resolution using the resampling tool of the ArcGIS 10.6 software. A total of 27 conversion results with a resolution of 100 m were candidate variables in the modeling, as shown in Table 3 and Figure 3.
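The distance conversion described above can be sketched as follows in Python; the grid, POI positions, and 100 m cell size are illustrative stand-ins for the ArcGIS workflow (a kernel-density surface can be approximated analogously, e.g., by smoothing a point-count raster):

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

# Hypothetical 100 m grid: True where a POI (e.g., a bus stop) falls in a cell
grid = np.zeros((50, 50), dtype=bool)
grid[10, 10] = grid[40, 25] = True

# Euclidean distance (in cells) from every cell to the nearest POI cell;
# multiplying by the 100 m cell size converts the result to metres.
dist_cells = distance_transform_edt(~grid)
dist_m = dist_cells * 100.0
```

Each cell of `dist_m` then holds one candidate-variable value, matching the 100 m raster resolution used for all 27 variables.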

Preparation of the Observation Data
An observation database was prepared for the LR model to learn the data features, including suitable and unsuitable location points with the values of their explanatory variables. The location points of PPLs were collected from the POI data from Gaode Map. This study assumed that ranges of 500 m around the existing locations of PPLs were suitable (an approximate walking distance of 5 min) [17]. After erasing the water and the assumed suitable areas, the non-PPL points were randomly sampled in the remaining area. The classification by the LR model using ML algorithms should avoid the class-imbalance problem [34]. In order to make the sample sizes of the positive and negative datasets similar, 690 non-PPL points were randomly selected. Figure 4 shows the locations of all the observation points.
Next, the values of all the observation points were extracted from the raster layers of the 27 candidate variables to create the reference database. Several points extracted null values from the raster layers; these abnormal points were discarded to reduce the model bias. Empirical studies have shown that the best results are obtained by training and testing data with a ratio of 70:30 or 80:20 [35]. In order to employ more data to test the performance of the model, this study chose the ratio of 70:30, and the data were randomly split into a training dataset and a test dataset.
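A minimal sketch of this buffer-based negative sampling and 70:30 split, assuming hypothetical coordinates on a 50 km × 50 km extent (the real study used GIS layers of GZ and also erased water bodies):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical coordinates (in metres): 679 existing PPLs and a pool of
# random candidate points standing in for the rasterized study area.
ppl_xy = rng.uniform(0, 50_000, size=(679, 2))
candidates = rng.uniform(0, 50_000, size=(2000, 2))

# Keep only candidates farther than 500 m from every PPL (i.e., outside
# the assumed "suitable" buffers), then draw a balanced negative sample.
d = np.linalg.norm(candidates[:, None, :] - ppl_xy[None, :, :], axis=2)
outside = candidates[d.min(axis=1) > 500]
neg = outside[rng.choice(len(outside), size=690, replace=False)]

# 70:30 random split of the combined positive and negative observations
n = len(ppl_xy) + len(neg)          # 679 + 690 = 1369 points
idx = rng.permutation(n)
train_idx, test_idx = idx[: int(0.7 * n)], idx[int(0.7 * n):]
```

Keeping the positive and negative counts close (679 vs. 690) is what guards against the class-imbalance problem cited above.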

Diagnosis of the Assumptions of LR Model
Before applying the LR model, it was necessary to examine the assumptions shown in Table 4. The data for modeling satisfied the requirements of the first four assumptions by the design of the dataset, but the last three had to be examined using other methods. Here, the diagnosis was conducted using Version 25 of the IBM SPSS Statistics software.
Note: In the examination column, 'Y' indicates that the assumption met the requirement and '?' indicates that the assumption needed to be verified.

• Diagnosis of the linearity of independent variables and log-odds
The Box-Tidwell method was employed here. It incorporates the interaction term between each continuous independent variable and its natural logarithm into the regression equation [36]. First, the natural logarithms of all the continuous independent variables were calculated using the compute-variable function in SPSS. Then, the interaction terms between the continuous independent variables and their logs were included in the binary LR analysis using SPSS. When an interaction term was statistically significant (p-value < 0.05), there was no linear relationship between the corresponding continuous independent variable and the logit of the dependent variable. It was recommended that all the items in the analysis (including the intercept term) be corrected using the Bonferroni method when testing the multiple significance of the linearity hypothesis [37]. In this study, 55 items were included in the model analysis: 27 continuous independent variables, 27 interaction terms between the independent variables and their natural logs, and the intercept term (constant). A p-value less than the corrected value (i.e., 0.05 ÷ 55 ≈ 0.0009) was taken to indicate nonlinearity. No observed p-value was less than the corrected value. Hence, linear relationships existed between all the continuous independent variables and the logit of the dependent variable.
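The construction of the Box-Tidwell interaction terms and the Bonferroni-corrected significance level can be sketched as follows; the predictors here are synthetic, and the actual fitting and p-value extraction were performed in SPSS:

```python
import numpy as np

# Box-Tidwell sketch: for each positive continuous predictor x, add the
# interaction term x * ln(x) to the design; a statistically significant
# interaction indicates a nonlinear logit for that predictor.
rng = np.random.default_rng(2)
X = rng.uniform(0.1, 5.0, size=(300, 3))   # 3 hypothetical positive predictors
interactions = X * np.log(X)               # one x*ln(x) term per predictor
design = np.hstack([X, interactions])      # predictors + interaction terms

# Bonferroni-corrected level for 55 simultaneous tests (27 + 27 + intercept)
alpha = 0.05 / 55
```

Note that the method requires strictly positive predictors, since ln(x) is undefined otherwise; distance and density rasters satisfy this after any zero cells are offset.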

• Diagnosis of multicollinearity

A good LR model exhibits low noise and is statistically robust: the explanatory variables are highly correlated with the dependent variable but minimally correlated with each other [38]. Multicollinearity occurs when explanatory variables exhibit strong correlations or associations with each other. When the degree of correlation is extremely high, the standard errors of the coefficients increase, which causes some variables to appear statistically insignificant in the results even though they are significant. Multicollinearity makes the coefficients unstable [39] and reduces the precision or interferes with the result when fitting the model [40]. It was mainly detected with the help of the tolerance (Tol) and its reciprocal, the variance inflation factor (VIF) [41], defined as follows:

Tol = 1 − R², VIF = 1/Tol = 1/(1 − R²),

where R² is the coefficient of determination for the regression of the explanatory variable on all the remaining independent variables. VIF > 10 and Tol < 0.1 were common thresholds for assessing multicollinearity between explanatory variables [38,42]. There were several ways to address the multicollinearity problem. First, multiple variables with collinearity could be combined into a single variable. Second, the sample size could be increased to decrease the standard errors. Third, some variables causing multicollinearity could be omitted from the model. Omitting variables was the most direct, simple, and effective way. In order to retain as many variables as possible, the most correlated variable was removed each time until the collinearity problem was no longer severe. Table 5 shows the VIF values of all the variables after omitting the variables with multicollinearity from the model.

Table 5. VIF values of all variables after omitting the variable with the multicollinearity problem.
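The Tol/VIF computation can be sketched directly from its definition; the data here are synthetic, with the third column deliberately built as a near-copy of the first to trigger the VIF > 10 rule:

```python
import numpy as np

def vif(X):
    """Tol and VIF per column: regress column j on the remaining columns,
    then Tol_j = 1 - R_j^2 and VIF_j = 1 / Tol_j."""
    X = np.asarray(X, dtype=float)
    vifs = []
    for j in range(X.shape[1]):
        y = X[:, j]
        Z = np.delete(X, j, axis=1)
        Z = np.column_stack([np.ones(len(Z)), Z])   # add intercept
        beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
        r2 = 1.0 - (y - Z @ beta).var() / y.var()
        vifs.append(1.0 / (1.0 - r2))
    return np.array(vifs)

# Synthetic check: the third column is a near-copy of the first,
# so both should breach the VIF > 10 threshold.
rng = np.random.default_rng(3)
a, b = rng.normal(size=500), rng.normal(size=500)
X = np.column_stack([a, b, a + 0.01 * rng.normal(size=500)])
v = vif(X)
```

In practice one would drop the worst offender, recompute, and repeat until all VIF values fall below 10, mirroring the iterative removal described above.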

• Diagnosis of obvious outliers
An outlier is an exceptional value that is very different from the others in a dataset. The LR model is sensitive to outliers. The usual approach to detecting outliers is based on the values of the standardized residuals: if the absolute value of a standardized residual is larger than three, the observation is usually considered an outlier [36]. After deleting the outliers, model fitting was conducted on the training dataset of 961 samples.

Determination of the Best Model Using the Stepwise Methods
There were many candidate variables in the model. It was important to detect the best variable combination for model fitting. A good model should adequately fit the data, and the predictor variables should not be too complicated. It was challenging to select the smallest number of candidate variables that could predict the dependent variable sufficiently while considering sample size constraints [36]. The forward and backward stepwise methods were frequently applied in previous studies of the LR model [43].
The forward stepwise selection method (FSSM) selected several significant predictors for the final model, adding variables based on their statistical significance in the regression. It started with an empty model with no predictors, first added the single variable that best predicted the output variable, and then repeatedly added the variable that most improved the model fit, comparing the incremental explanatory power of the larger models [44]. The process continued until a stopping rule was satisfied. In FSSM, variables added early in the process could be removed at a later stage if they became unimportant once other variables had entered the model. Using the FSSM technique, the variables could be ranked by importance according to the order in which they were added.
Unlike FSSM, the backward stepwise elimination method (BSEM) started with a model containing all the predictors and then eliminated the least effective predictor one at a time, again continuing until a stopping rule was satisfied. In the literature, the recommended stopping rule is a p-value of ~0.15 [45,46]. In the SPSS software, the default threshold values for FSSM and BSEM are 0.05 and 0.1, respectively.
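A simplified forward-selection loop can be sketched as follows; it uses in-sample log-likelihood as a stand-in for the significance-based entry criterion applied in SPSS, and all data are synthetic:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def forward_stepwise(X, y, max_vars=3):
    """Greedy forward selection: at each step add the variable that most
    improves the mean log-likelihood; stop when nothing improves it."""
    n, p = X.shape
    selected, best_ll = [], -np.inf

    def ll(cols):
        m = LogisticRegression(max_iter=1000).fit(X[:, cols], y)
        probs = m.predict_proba(X[:, cols])
        return np.mean(np.log(probs[np.arange(n), y]))

    while len(selected) < max_vars:
        cand = [(ll(selected + [j]), j) for j in range(p) if j not in selected]
        score, j = max(cand)
        if score <= best_ll:
            break
        selected.append(j)
        best_ll = score
    return selected

# Synthetic data in which only column 2 truly drives the outcome
rng = np.random.default_rng(4)
X = rng.normal(size=(300, 5))
y = (X[:, 2] + 0.1 * rng.normal(size=300) > 0).astype(int)
order = forward_stepwise(X, y)
```

The entry order of `order` is what allows variables to be ranked by importance, as noted above for the FSSM.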

Evaluation of the Model's Performance
The performance of the LR models was evaluated based on their discrimination and calibration. Discrimination referred to the ability of the model to correctly distinguish between the two suitability classes based on the prediction values. The capacity for discrimination was often measured using a confusion matrix and by calculating indices of classification performance [47]. The LR model used the logistic function to map the predictions to probabilities between 0 and 1. The default threshold of 0.5 was commonly used: a PPL was assumed to be present if the probability was above 0.5 and absent otherwise. The classification accuracy was determined by comparing the predictions with the real values. The classification table was divided into four types: true positives (TPs) and true negatives (TNs) indicated the numbers of correctly predicted PPLs and non-PPLs, while false positives (FPs) and false negatives (FNs) denoted the numbers of incorrect predictions. Several further indicators were used to measure the performance of a model or its predictors. The accuracy was the total number of correct predictions divided by the total number of predictions made for a dataset. However, even unskillful models can show high accuracy scores when the class imbalance is severe. An alternative to the classification accuracy was to use precision and recall. Unfortunately, precision and recall may sometimes contradict each other. The F-Measure (also known as the F-Score) was the most common method for balancing both indicators in a single score, computed as the harmonic mean of precision and recall. The discrimination only compared the predicted probability with the threshold of 0.5; it ignored how far the predicted value was from the true value. Calibration resolved this shortcoming by describing how close the predicted value was to the actual value. The Brier score was an important calibration index that measured the accuracy of probabilistic predictions.
It was applicable to tasks in which predictions assigned probabilities to a set of mutually exclusive discrete outcomes. The set of possible outcomes could be either binary or categorical in nature, and the probabilities assigned to this set of outcomes must have summed to 1, with each individual probability ranging from 0 to 1 [48]. For the binary case, the Brier score was the mean squared difference between the predicted probability and the actual outcome:

BS = (1/N) ∑ (q_i − x_i)²,

where x is the real dependent variable, and q is the predicted probability. The lower the Brier score for a set of predictions, the better the predictions were calibrated. In this study, the reduction ratio of the variables involved in the modeling (the model optimization rate) was also used to evaluate the model's performance. The receiver operating characteristic (ROC) curve was also a popular method for testing a model's accuracy and describing the quality of a probabilistic prediction system [49]. The area under the ROC curve (AUC) was a common metric of the level of discriminative ability; the larger the area, the better the performance of the model. The following classification of accuracy using the AUC was considered: 0.90-1 (excellent), 0.80-0.90 (good), 0.70-0.80 (fair), 0.60-0.70 (poor), and 0.50-0.60 (fail) [50,51].
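These evaluation indices can be computed as follows with scikit-learn; the labels and predicted suitability probabilities here are hypothetical:

```python
import numpy as np
from sklearn.metrics import brier_score_loss, f1_score, roc_auc_score

# Hypothetical test labels and predicted suitability probabilities
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
p_pred = np.array([0.92, 0.10, 0.75, 0.60, 0.35, 0.05, 0.88, 0.45])

f1 = f1_score(y_true, p_pred > 0.5)       # discrimination at the 0.5 threshold
brier = brier_score_loss(y_true, p_pred)  # calibration: mean squared probability error
auc = roc_auc_score(y_true, p_pred)       # threshold-free discrimination
```

Note that the F-measure uses only the thresholded predictions, while the Brier score and AUC use the raw probabilities, which is exactly the discrimination-versus-calibration distinction drawn above.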

Generation of the Suitability Map
The coefficients of the selected optimum variables and the constant of the best LR model were substituted into Equation (9). The suitability index of Equation (10) was applied to each raster of the whole study area for prediction. According to the classification threshold of the LR model, the suitability map for PPLs consisted of two categories: a raster with a predicted value between 0.5 and 1 was reclassified as a suitable area, and a raster with a value between 0 and 0.5 was reclassified as an unsuitable area. Table 6 shows the model's performance with the combinations of variables selected by the FSSM and BSEM. The discrimination and calibration of the BSEM are slightly better than those of the FSSM; however, the optimization rate of the FSSM is 20% higher. This indicates that the two methods for selecting the optimal variable combination show a similar model accuracy and bias, while in terms of model optimization, the FSSM performed better than the BSEM. Table 7 shows the coefficients of the best explanatory variable combination as determined by the BSEM. The Wald value indicates the significance of the variables. Eight significant variables were selected from the 25 variables without multicollinearity. Among these eight variables, five were selected from the accessibility factors, and one each was selected from the social, topographic, and urban development factors. Among the five selected accessibility factors, three were variables of proximity to various types of buildings. According to the Wald values, the most crucial factor was Dist_Res_Qua, with a value of 45.5, followed by SLPrice (29), Dist_BusStop (28.4), and Dens_ComB (20.7). According to the signs of the coefficients, the variables Dist_Res_Qua, Dist_BusStop, Dist_Com_OffB, Dist_Road_Sec, Dist_Res_Vil, and SLPrice were negatively correlated with the suitability for PPLs in the raster unit, while the DEM and Dens_ComB were positively correlated.
Thus, a suitable PPL site tends to be situated close to residential quarters, commercial offices, and residential villas; near bus stops or secondary roads; in areas with relatively low land prices; and in high-density zones of commercial buildings.
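Applying the fitted coefficients raster-by-raster can be sketched as follows; the coefficients, raster stack, and extent here are hypothetical stand-ins for the best LR model and the GZ rasters:

```python
import numpy as np

# Hypothetical fitted model: intercept plus two explanatory variables,
# applied to a raster stack of shape (n_vars, rows, cols) at 100 m resolution.
beta0 = -1.0
beta = np.array([-0.002, 0.5])
rng = np.random.default_rng(5)
stack = rng.uniform(0, 10, size=(2, 40, 40))

logit = beta0 + np.tensordot(beta, stack, axes=1)  # linear predictor per cell
suitability = 1.0 / (1.0 + np.exp(-logit))         # logistic link: index in [0, 1]
suit_map = (suitability >= 0.5).astype(np.uint8)   # 1 = suitable, 0 = unsuitable
```

The binary `suit_map` corresponds to the two-class reclassification described above, with the 0.5 threshold separating suitable from unsuitable rasters.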

Evaluation of the Classification Performance
The test dataset was used to conduct an unbiased evaluation of the final model fitted on the training dataset. The final LR model with the best variable combination and coefficients was applied to the test dataset. The F-measure, Brier score, and AUC were the indicators used to evaluate the model's classification performance, as shown in Table 8. The larger the F-measure, the higher the discrimination accuracy of the model's classification; the F-Measure values for both the training and test data were greater than 89%. The lower the Brier score, the smaller the predicted deviation and the higher the calibration degree of the model; the Brier scores were less than 0.09. The AUC values for both datasets were between 0.9 and 1, indicating excellent accuracy. Overall, the predictive performance of the final LR model was effective. Additionally, the performance on the test dataset was better than that on the training dataset. Figure 5 demonstrates the suitability for PPLs simulated using the best LR model. The suitability for PPLs is divided into two classes: the suitable area in orange and the unsuitable area in blue. Most of the suitable areas are concentrated in the central districts and dispersed in small patches in the outer districts. Figure 6 summarizes the sizes and percentages of the suitable areas by district. Panyu district has the largest suitable area, while Liwan district has the smallest. Yuexiu district has the greatest proportion of suitable area, more than 80%, while Conghua has the smallest, only 1%. Overall, the suitable area is approximately 614 sq. km, accounting for 8% of the total area of GZ. The site-selection range for PPLs can focus on these suitable areas, which significantly reduces the difficulty of analysis and the time costs.

Discussion
Big data make location analysis in a macro-scale area possible. POI data, an innovative data source with a low cost, can identify the existing locations of PPLs and other related facilities. Some studies used POI data to analyze the PPL distribution patterns in specific cities of China and found them to be strongly consistent with economic development levels, population density, and traffic convenience [21,23,24]. This study further developed a GIS-based LR classification model using an ML algorithm to identify suitable areas from the bottom up with massive, detailed data, which differs from previous studies conducted with the MCE approach. The optimum explanatory variables from the 27 candidates and their coefficients for the LR models were determined using a training dataset with stepwise methods. The FSSM performed better than the BSEM in the optimization of variables. The most crucial variable was Dist_Res_Qua. It was much more important than the variables of the distance to various transport services/roads and the density of related points. This result is consistent with customers' preference for PPLs located near their home addresses [52]. Furthermore, this study subdivided residential buildings into four types as candidate variables to analyze their relationships with PPLs. The results showed that the type of residential quarters was the most crucial variable; the types of dormitory and community center (CC) were not determining variables for the locations of PPLs. A CC is a place providing recreational, cultural, and social activities for the surrounding residential neighborhoods. Although a CC is usually near the residential buildings of the community, it is difficult to combine the behavior of picking up parcels with entertainment or social activities. The residential buildings of dormitories are mainly located in colleges, factories, or institutions with closed management. The dormitory areas are usually far from the entrance.
Distributing parcels to PPLs near a dormitory building takes a long time, and delivery vehicles have limited accessibility there. Because of the security of internal personnel and the long delivery times, parcels for dormitories are generally signed for and stored by guards or by shops offering parcel-pickup services. Moreover, the population in a dormitory area is dense while the capacity of a PPL is limited; given the high machine cost, it is impractical to install enough PPLs to meet the high demand there. Finally, PPLs are closer in nature to public service facilities, and their economic benefit is limited, so dormitory managers prefer to lease the land to commercial shops rather than to PPLs to obtain more rent.
Another interesting finding was that population density was not selected as a critical factor for determining PPL locations in this study. This differs somewhat from previous studies, which proposed that the density of PPPs had a strong positive correlation with population density [20,21]. The reason may be the difference in research scales. The previous studies were based on administrative units: according to the characteristics of existing PPP locations, the relationship between the density of PPPs and the densities of various factors in each administrative unit of the study area was investigated using correlation analysis [21,23,24]. That analysis focused mainly on quantitative relationships and ignored the distances to the locations of various factors. For distance analysis, statistical methods were widely used to determine the distance range between most PPP locations and the surrounding features. Unlike previous studies, this work modeled locations using raster units. The locations of existing PPLs and random non-PPL points were used as the training and testing datasets. The features of existing PPL locations were extracted by the ML algorithm to generate the best model, which then classified the remaining raster units into suitable and unsuitable sites. This method considered both density- and distance-related factors, and the model could identify locations suitable for PPLs rather than only characterizing existing points. It could also distinguish the variables that yield suitable PPL locations from a large number of candidate factors using Wald values (importance). Another reason was that the raster population data were not highly accurate and only reflected the nighttime population.
The population density data used here were the population prediction data in the WorldPop dataset developed by the WorldPop Project; up-to-date, high-resolution raster data for population density were hard to obtain. The predicted population in the WorldPop dataset was simulated from official census data and nighttime satellite images [53]. In the best model of this study, the most critical variables yielding suitable PPL locations included Dist_Res_Qua, Dist_Busstop, and Dist_Com_OffB, factors that also have a strong relationship with population.
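The raster-unit classification workflow described above can be illustrated with a minimal sketch: existing PPL cells and random non-PPL cells train a bivariate LR model, which then labels unseen raster cells as suitable or unsuitable. All feature values below are synthetic, and the tiny gradient-descent fitter merely stands in for the statistical software used in the study (which also reported Wald statistics for variable importance); only the variable names Dist_Res_Qua and Dist_Busstop are taken from the paper.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-max(-500.0, min(500.0, z))))

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit a logistic regression by batch gradient descent (illustrative only)."""
    n_feat = len(X[0])
    w, b = [0.0] * n_feat, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * n_feat, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(n_feat):
                gw[j] += err * xi[j]
            gb += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, gw)]
        b -= lr * gb / len(X)
    return w, b

# Synthetic training data: each row is a raster cell described by
# [Dist_Res_Qua, Dist_Busstop] in km (hypothetical values).
# y = 1 for cells holding existing PPLs, 0 for random non-PPL cells.
random.seed(42)
X, y = [], []
for _ in range(200):
    # PPL cells cluster near residential quarters and bus stops...
    X.append([random.uniform(0.0, 0.4), random.uniform(0.0, 0.6)])
    y.append(1)
    # ...while non-PPL cells lie farther away on average.
    X.append([random.uniform(0.5, 3.0), random.uniform(0.8, 4.0)])
    y.append(0)

w, b = fit_logistic(X, y)

def classify(cell):
    """Label an unseen raster cell as suitable (1) or unsuitable (0)."""
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, cell)) + b) >= 0.5 else 0

print(classify([0.1, 0.2]))  # cell close to a residential quarter and a bus stop
print(classify([2.5, 3.5]))  # remote cell
```

In the study itself, every raster cell of Guangzhou would be passed through such a classifier, and the contiguous cells labeled 1 would form the boundaries of the suitable areas.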
There are several assumptions and limitations in this study owing to insufficient data. First, the existing PPL locations are treated as locations suitable for PPLs. These points serve as the samples whose features the ML algorithm learns; however, they may not be consistent with actual suitability. Only current POI data are available, not historical POI data, so it is impossible to analyze the relationship between historical PPL locations and the surrounding environment. In addition, because PPL usage data are not available, it cannot be determined whether the existing PPL locations are realistically appropriate. Second, competition among PPLs was not considered; in reality, PPLs are operated by different companies that may compete with one another. Third, the 27 candidate factors in the model are social and location-related factors; market and user-behavior preference factors are not included in this study.
Moreover, a metropolis is a large city consisting of a densely populated urban core and less-populated surrounding territories under the same administrative jurisdiction [54]. The PPL density is also unbalanced in different areas of a metropolis. Future research could divide metropolitan areas into multiple zones according to population density for modeling and further analyze the differences in the variables chosen by the model in the various zones.

Conclusions
Previous studies of SSA for PPLs commonly addressed the site-selection problem with given sites in a specific area [16,17]; few have focused on the site-search problem with quantitative models. GIS-based SSA techniques have been widely applied in urban planning activities involving multiple factors. Among the GIS-based SSA approaches, the ML method was superior to the other two and worked best for problems involving enormous datasets, and the LR model is the most common and most explainable of the data-driven ML algorithms. This paper proposed a GIS-based bivariate LR model with a supervised classification algorithm for the SSA of PPLs and explicitly identified the boundaries of suitable areas. The micro-scale raster provided the basic unit of observation, and suitability classification was conducted for each raster cell. The crucial factors and their weights were determined using the training data; 30% of the data was used to test the model's accuracy and evaluate the performance of the best model. Two stepwise methods (FSSM and BSEM) were employed to determine the optimum combination of variables from the 27 candidates. The performance of the LR models was evaluated based on their discrimination, calibration, and optimization rates. The results indicated that the FSSM, with fewer variables, had an absolute advantage in model optimization; although the BSEM selected more variables than the FSSM, it brought only a slight improvement in the other indicators.
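The forward stepwise idea behind the FSSM, adding at each step the candidate variable that most increases the model's log-likelihood and stopping when no remaining candidate clears a threshold, can be sketched as follows. The synthetic data, the threshold value, and the simple gradient-descent fitter are all illustrative assumptions, not the study's actual procedure; only the variable names Dist_Res_Qua and Dist_Busstop come from the paper.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-max(-500.0, min(500.0, z))))

def fit_and_loglik(X, y, cols, lr=0.5, epochs=500):
    """Fit an LR model on the chosen feature columns by gradient descent
    and return its final log-likelihood (illustrative, not production code)."""
    rows = [[x[c] for c in cols] for x in X]
    w, b = [0.0] * len(cols), 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * len(cols), 0.0
        for xi, yi in zip(rows, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                gw[j] += err * xj
            gb += err
        w = [wj - lr * g / len(rows) for wj, g in zip(w, gw)]
        b -= lr * gb / len(rows)
    ll = 0.0
    for xi, yi in zip(rows, y):
        p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
        p = min(max(p, 1e-12), 1.0 - 1e-12)
        ll += yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return ll

def forward_stepwise(X, y, names, threshold=3.0):
    """At each step, add the variable with the largest log-likelihood gain;
    stop when no remaining candidate improves the fit by more than `threshold`."""
    selected, remaining = [], list(range(len(names)))
    best_ll = fit_and_loglik(X, y, selected)  # intercept-only (null) model
    while remaining:
        gain, col = max((fit_and_loglik(X, y, selected + [c]) - best_ll, c)
                        for c in remaining)
        if gain < threshold:
            break
        selected.append(col)
        remaining.remove(col)
        best_ll += gain
    return [names[c] for c in selected]

# Synthetic raster cells: [Dist_Res_Qua, Dist_Busstop, Noise] in km (made up).
# Suitability is driven mainly by proximity to residential quarters.
random.seed(0)
X, y = [], []
for _ in range(150):
    d_res, d_bus, noise = (random.uniform(0, 3) for _ in range(3))
    label = 1 if d_res + 0.5 * d_bus + random.gauss(0, 0.3) < 1.5 else 0
    X.append([d_res, d_bus, noise])
    y.append(label)

selected = forward_stepwise(X, y, ["Dist_Res_Qua", "Dist_Busstop", "Noise"])
print(selected)  # the strongly predictive Dist_Res_Qua enters first
```

In practice, stepwise entry thresholds are usually stated as significance levels on the likelihood-ratio statistic rather than a raw log-likelihood gain; the fixed threshold here only keeps the sketch short.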
From the 25 potential variables remaining after multicollinearity screening, eight crucial variables were chosen for the final LR model. Three were distances to various types of buildings, and the proximity to residential buildings was more important than that to commercial buildings. The most crucial factor was the proximity to residential quarters, whose importance was twice that of land price and of proximity to a bus stop. This result was consistent with customers' preference for PPLs located near their home addresses [52]. The study further supported the idea that the residential quarter was the most important of the four types of residential buildings, while the dormitory and CC types were relatively unimportant. The final model identified the boundaries of areas suitable for PPLs, accounting for 8% of the total area of GZ. The site-selection ranges for PPLs can be focused on these areas, which significantly reduces the difficulty of analysis and time costs. There were several limitations in this study owing to insufficient data; future research should divide metropolitan areas into multiple zones for modeling and analyze the differences in the variables chosen by the model in the various zones.

Data Availability Statement:
The data presented in this study are available from the author upon reasonable request.