Site Selection of Fire Stations in Large Cities Based on Actual Spatiotemporal Demands: A Case Study of Nanjing City

The rapid expansion of cities brings new challenges for urban firefighting security, while the increasing fire frequency poses serious threats to the lives, property, and safety of city residents. Firefighting in cities is a challenging task, and the optimal spatial arrangement of fire stations is critical to firefighting security. However, existing research does not consider how the spatial randomness of fire outbreaks and the response delays caused by traffic jams affect site selection. Based on the set cover location model integrated with spatiotemporal big data, this paper combines fire outbreak points with the traffic situation. The presented site selection strategy ensures that the firefighting task force arrives at randomly simulated fire outbreak points within the required time, under the constraints of actual city planning and traffic conditions. Taking Nanjing city as an example, this paper collects multi-source big data for a comprehensive analysis, including the full fire outbreak history from June 2014 to June 2018, traffic jam data from Amap, and survey data on the firefighting facilities in Nanjing. The regularity behind fire outbreaks is analyzed, the factors related to fire risks are identified, and the risk score is calculated. The historical fire outbreak points are put through a clustering analysis, the spatial distribution probability of points in each cluster is calculated according to the clustering score, and random fire outbreak points are generated via Monte Carlo simulation. Meanwhile, the target emergency response time is set to five minutes. The average vehicle speed for each road in the urban area is calculated, and an actual traffic network model is built to compute the travel time to a large number of randomly distributed simulated fire points. The problem is solved by keeping the travel time to all simulated demand points below five minutes. Finally, the site selection result based on our model is adjusted and validated according to the planned land use. The presented method incorporates the perspective of spatiotemporal big data and provides a new idea and technical method for modifying and improving the efficiency of the fire station site selection model, contributing to an increase in the service cover ratio from 58% to 90%.


Introduction
Urban fires have become increasingly frequent in recent years due to the deepening conflicts between urban population, resources, and the environment. The number of fire cases in China reached 252,000 in 2020, which is 65% higher than that in 2012. The consequent direct property loss in 2020 was 4.0 billion yuan, which is 83% higher than that in 2012. These fires seriously threaten people's lives and property, and they severely disrupt normal economic activities. In this context, it is of great importance to enhance urban firefighting efficiency. The fire risk of large cities is tremendously complex. The rapid development of large cities has resulted in massive accumulated risks, while continued development brings various new fire rescue risks. New types of disaster drivers interact with conventional ones; high-rise and large-volume buildings are greatly increasing in number; new materials and new products are extensively applied; population movement is increasingly diversified; and the uncertainties and uncontrollable factors behind fire outbreaks are growing significantly. Thus, fire prevention and control become more difficult, and the requirements on response time become more stringent. Given this, a rational layout of fire stations is of utmost importance for improving firefighting security and disaster prevention.
Commissioned by the Nanjing Public Security Fire Bureau, this study aims to apply the location set cover model, integrated with spatiotemporal big data, to urban planning, based on the five-year fire data from June 2014 to June 2018. In this way, a novel site selection model for emergency service facilities can be built. The model is integrated with random spatial demand simulation methods and real-time traffic networks to account for the influences of event randomness and traffic conditions, which are often ignored in previous models. The availability of firefighting and traffic spatiotemporal big data lays an accurate data foundation for the modeling, which is of crucial importance for the site selection of fire stations. The research findings provide theoretical support for the "Nanjing fire station layout planning" that has been implemented since 2020 by the local government.
In terms of structure, this article consists of five parts: the first part is a literature review; the second part is the research method and modeling; the third part introduces the research scope and data sources; the fourth part presents the detailed analysis process, test results, and verification; and the fifth part is the discussion and conclusions.

Literature Review
The arrangement of fire stations in cities has long been an important research topic. The U.S. and Europe have been studying the layout of fire stations and developing relevant firefighting plans and regulations since the 1970s. The Code for Planning Urban Fire Control of China stipulates that, within the urban development land, the principle of the ordinary fire station arrangement is to ensure the arrival of firefighting trucks within five minutes. Moreover, taking the road and traffic conditions in the downtown and marginal areas of cities into consideration, the layout of fire stations shall enable a firefighting administration zone of 4-7 km² in the downtown urban area and no bigger than 15 km² in the marginal urban area. The conventional fire station arrangement theory is mostly based on graph theory. The planning area is divided into basic firefighting units, and the center of each unit represents a node in the corresponding fire outbreak position network. By doing so, the site selection problem is converted into a problem of coverage or of the minimum distance from facilities to multiple point sets. The early location models, represented by the covering model, the P-center model, as well as the P-median model and its modified models, use the geographic center or key points of each zone to replace the demand point set. Consequently, the site selection process considers only the expansion of coverage with the shortened distance [1-3]. It has been pointed out that planning based on the travel time or travel distance of firefighting trucks cannot keep pace with city development and that the fire station layout shall consider more factors [4]. Moreover, multi-objective programming models are proposed to involve more factors such as the water supply for firefighting and political interference [5]. Chen and Zhao [6] introduce constraints such as the weather and terrain into their case simulation and build the comprehensive objective function of facility site selection via the analytic hierarchy process. Similarly, some scholars apply the multi-objective model to the fire station selection in Samia, Canada. These models attempt to introduce the factors that may trigger fires, and yet they still optimize the fire station layout with respect to the average demand. This may lead to massive firefighting activities in high-risk zones and, in the meantime, much fewer activities in low-risk zones. The firefighting response time also varies with fire risk levels as well as road and traffic conditions in different zones. Therefore, the aforementioned fire station site selection process tends to result in high randomness in service zone demarcation and a sparse distribution of fire stations.
With progress in developing multi-objective models, random factors are gradually incorporated in site selection for emergency responses [7,8]. Specifically, random factors generally derive from the demand and traffic uncertainties induced by spatiotemporal variation of the environment. Hence, how to simulate the actual demand to evaluate the fire risk and how to overcome the traffic network constraint have become key research topics. Efforts have been made to investigate the index system and methodology of fire risk evaluation [7,9], which is normally approached from four aspects, namely the fire hazard source, the firefighting facility construction status, the firefighting management status, and the regional disaster relief capability [8,10-16]. The evaluation results are then applied to optimize the site selection model of fire stations [17].
At present, the evaluation methodology mainly includes the expert scoring method, the fire-grade hierarchy method, the risk matrix method, and the fire risk index method [18-21]. Among applications of such methods, the studies carried out by Wang et al. [22] and Xu et al. [23], which perform risk grading by land use via an analytical hierarchy process, are relatively representative. Recently, it has become common to identify the factors related to fire risks from historical fire data, owing to the rapid development of GIS, data mining, and visualization techniques. Xu et al. [24] apply kernel density analysis to city fire risk ranking based on fire outbreaks in 2013-2016, which confirms the superiority of the site selection results based on the evaluation index of fire risk level over those of the conventional model. Zhang [25] and Lin et al. [26] demarcate the urban risk zones and perform comprehensive evaluations of the city protection grade, using the fire data and the kernel density of the key objects for safety; the zone with a higher grade is given priority in fire station site selection. Wang [27] established a classification of fire risk levels based on the kernel density of POI facilities and substituted it into the SAVEE model for site selection planning. Fire station selection based on the evaluation results of fire risks greatly improves the effectiveness and service rates of fire stations. Nonetheless, fire outbreaks, as random events, are inherently uncertain. In other words, every space or building may catch fire and shall be regarded as a potential fire outbreak point. Furthermore, fire outbreaks are attributed to numerous factors, and thus the hot-spot zone of fire outbreaks cannot fully represent the outbreak risks, though it is, to some extent, representative. Moreover, the existing fire station selection models generally fail to consider the effects of traffic jams. The traffic network is in most cases abstracted into a road topological map, and the response time or service zone is calculated using the minimum weighted distance model [28-30]. It should be noted that urban firefighting access is an important premise for the rapid arrival of firefighting taskforces at the scene of the fire and thus for the reduction and relief of fire disasters. The construction and condition of the urban traffic network are important factors affecting the fire station arrangement and its optimization. Although some studies consider the effect of city road traffic capacity on the travel speed of emergency rescue trucks, which they define as the road design speed or an assumed speed [31,32], they ignore the delay that may be caused by ever-growing traffic jams in the city and the great threats to life and property derived from the extended travel time of firefighting taskforces. Wang [33] collects the traffic performance index data of Shenzhen over three years and maps the daily traffic jams. Mao [34] and Ming [35] collected traffic data for one hour in the morning of June 2020 and one week in May 2020, respectively, to evaluate fire vehicle speeds based on the Amap API. Specifically, the data collected by Ming [35] only have three constant speeds for three different traffic congestion states, and thus they are not representative.
To conclude, much progress has been made in addressing the fire station site selection issue based on models that consider spatiotemporal demands. However, there are two prominent shortcomings in these models. (1) The randomness of the spatial demand distribution is not sufficiently discussed. For random events, such as a fire, the existing site selection models mainly use fire-related factors to divide the risk level. However, this is only an analysis of historical data, and the obtained site selection plan cannot be ensured to be valid during the whole planning period. Although certain efforts have been made to incorporate the spatial randomness factor into the site selection model, there is still a lack of a random simulation method for spatial demands. Specifically, such a method is expected to not only accurately describe the demand distribution but also effectively optimize the model based on that description, so that the site selection issue can be better solved. (2) Existing site selection models generally lack consideration of actual travel time and distance. Some studies have demonstrated the impact of traffic conditions on the accessibility of emergency stations by obtaining traffic data through APIs, and they have also tried to include short-term data in the models. However, there are still no sufficiently efficient site selection methods that can stream long-term data and include traffic factors quantitatively in the model.

Methodology
This study targets the urban area of Nanjing City and builds a spatiotemporal dataset of multi-source data, including the fire history from June 2014 to June 2018 as well as the real-time traffic data and traffic network data obtained via the Amap Open API. Moreover, this study investigates the regularity behind fire outbreaks and identifies and selectively incorporates the risk factors into the fire risk evaluation system. Based on the entropy weight method, the risk factors of the different fire types are normalized and the weight of each factor is obtained. After meshing the urban area of Nanjing, the fire risk value distribution across the grid cells is identified via superposition. Meanwhile, the fire data of Nanjing in 2014-2018 are put through a clustering analysis, and the probability distribution at each point of the cluster is calculated using the average score. Monte Carlo simulation is performed to map the random fire outbreak points, and then these numerous randomly distributed simulated fire outbreak points are input into the actual traffic network model, which is finally passed to the location set cover model constrained by the land use plan to solve the site selection problem (Figure 1). The presented method overcomes the difficulty of mapping fires caused by the complex fire distribution and meanwhile eliminates the error induced by the idealization of the traffic situation. It is worth noting that this research project is supported by the Nanjing Fire Department and the Nanjing Bureau of Planning and Natural Resources. The research findings provide important guidance on the actual site selection of fire stations in Nanjing, and the presented method, a novel method for urban fire station site selection in the big data era, can serve as a practical reference for analogous specialized planning in other cities. Fire station site selection based on fire risk evaluation can greatly improve the effectiveness and service rate of fire stations.

The Improved LSCP Model
At present, the frequently used site selection models include the set cover model, the P-median model, and the P-center model. The set cover model is in essence a model dealing with the optimal site selection for discrete points, which are often the identified distributed demand points. Site selection based on the set cover model needs to comprehensively consider multiple factors such as the quantity and position of the facilities and the economic effectiveness. Covering models can be divided into two types by their objectives, namely the location set cover model and the maximal covering location model. The former was first proposed by Toregas et al. [36,37] and aims at the minimum facility or construction costs under the premise that all demand points are covered. Subsequently, Church and ReVelle [38] developed the maximal covering location model based on the location set cover model. The maximal covering location model targets the facility layout that serves the maximum demand under the premise of a known number of service stations and a known service range. The most important difference between the maximal covering location model and the set cover model is whether or not the facility quantity is fixed in advance. In addition, the former highlights serving demands, while the latter emphasizes the minimum cost. The P-median model, proposed by Hakimi [39], aims at minimizing the total distance between each demand point and the corresponding facility, so as to realize the best overall service performance of fire stations. Hakimi then modified the P-median model and developed the P-center model [40], the optimization objective of which is to minimize the maximal distance from all demand points to the corresponding facilities. The P-center model can result in a more distance-balanced layout of facilities. Given that the goal of fire station planning in China is to realize full coverage of the rescue and disaster relief network across the urban and suburban areas, and given practical conditions, this paper adopts the location set cover problem (LSCP) model as the basic model. LSCP is one of the most important site selection models for locating emergency facilities [41]. Toregas et al. [36] were the first to apply LSCP to locating fire stations, in the form shown below:

$$\min \sum_{j \in W} x_j \quad (1)$$

$$\text{s.t.} \quad \sum_{j \in N_i} x_j \ge 1, \quad \forall i \in V \quad (2)$$

$$x_j \in \{0, 1\}, \quad \forall j \in W \quad (3)$$

where $V$ is the set of the demand zone; $W$ is the set of facility points; $i$ is the demand point sequence number; $j$ is the facility point sequence number; $N_i$ is the set of the facilities that can serve the demand point $i$; and $x_j$ is a variable equal to one or zero, representing whether or not to build the $j$-th facility.
The objective function Equation (1) requires a minimal quantity of service facilities. The constraint Equation (2) states that each demand point shall be served by at least one facility. Equation (3) restricts the values of the decision variable. Equations (1)-(3) constitute a discrete model, requiring the input of a series of spatial demand elements (including points, lines, and planes) and the location set of potential facility sites. The variable $x_j$ indicates whether node $j$ is chosen to host a facility ($x_j = 1$) or not ($x_j = 0$). The LSCP model covers the demand space by placing a minimal number of facilities at some locations, which thus requires determining the inputs (namely the demand points) and solving for $x_j$. However, the basic location set cover model is not fully applicable to characterizing fire demands, so the model is improved according to the characteristics of these demands. (1) A random simulation point $i$ is used to replace the original event point. (2) For any simulated demand point, the travel time from the fire station to the demand area ($t_{ij}$) should be shorter than the target time ($t^{r}_{s}$) in the road network model ($T$), which is based on traffic congestion. The specific steps of the optimized location set cover model algorithm are as follows (Figure 2).
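As a minimal sketch of how Equations (1)-(3) behave once a target time is imposed, the following Python snippet enumerates candidate subsets of increasing size and returns the first one that covers every demand point. The function name `solve_lscp`, the dictionary layout, and the toy travel times are all hypothetical and serve only as an illustration; exhaustive enumeration is practical only for very small instances and is not the solver used in this study.

```python
from itertools import combinations


def solve_lscp(travel_time, target_time):
    """Return a smallest tuple of candidate sites covering all demand points.

    travel_time[j][i] is the travel time (minutes) from candidate site j
    to demand point i; target_time is the required arrival time.
    """
    sites = sorted(travel_time)
    demand = sorted({i for row in travel_time.values() for i in row})
    # N_i: the candidate sites that can reach demand point i in time.
    n_i = {
        i: {j for j in sites if travel_time[j].get(i, float("inf")) <= target_time}
        for i in demand
    }
    if any(not reachable for reachable in n_i.values()):
        raise ValueError("at least one demand point cannot be covered")
    # Try subsets of size 1, 2, ...; the first feasible one is minimal.
    for size in range(1, len(sites) + 1):
        for chosen in combinations(sites, size):
            chosen_set = set(chosen)
            if all(n_i[i] & chosen_set for i in demand):
                return chosen
    return tuple(sites)


# Toy instance (made-up numbers): three candidates, three demand points.
toy_times = {
    "A": {1: 3.0, 2: 4.0, 3: 9.0},
    "B": {1: 8.0, 2: 4.5, 3: 2.0},
    "C": {1: 6.0, 2: 7.0, 3: 4.0},
}
selected = solve_lscp(toy_times, target_time=5.0)
```

In this toy instance no single site reaches all three demand points within five minutes, so the minimal solution contains two sites.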

Generation of Random Demand Points
The overall methodology of this research is summarized below. First, the spatiotemporal features are analyzed using the historical fire outbreak data, the fire outbreak factors are identified, and each index is weighted using the entropy weight method. The study area is then meshed into spatial grid cells, and the weights are assigned to each grid cell to obtain the fire outbreak risk value of each grid cell. Second, the fire outbreak points are put through a k-means clustering analysis, which generates several clusters, each with a cluster center; the point coordinates in each cluster are checked for compliance with the Gaussian distribution; the mean value and variance are calculated; the average score across a cluster is computed; and the probability distribution of the points in the cluster (interval) and the number of points generated by each cluster are calculated. Finally, random fire outbreak points are generated via Monte Carlo simulation based on the mean value and variance, and the confidence level is calculated.
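The generation step summarized above can be sketched as follows. This is an illustrative outline only: it assumes each cluster is summarized by a mean, a per-axis standard deviation, and a relative weight, and it treats the two coordinate axes as independent Gaussians. The function name `simulate_fire_points`, the dictionary keys, and the example numbers are hypothetical and do not come from the Nanjing dataset.

```python
import random


def simulate_fire_points(clusters, n_points, seed=0):
    """Draw simulated fire outbreak points from per-cluster Gaussians.

    clusters: list of dicts with keys "mean" (x, y), "std" (sx, sy) and
    "weight" (relative probability of a fire falling in that cluster).
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    weights = [c["weight"] for c in clusters]
    points = []
    for _ in range(n_points):
        # Pick a cluster in proportion to its weight, then sample inside it.
        c = rng.choices(clusters, weights=weights, k=1)[0]
        x = rng.gauss(c["mean"][0], c["std"][0])
        y = rng.gauss(c["mean"][1], c["std"][1])
        points.append((x, y))
    return points


# Two made-up clusters in projected coordinates (metres).
example_clusters = [
    {"mean": (0.0, 0.0), "std": (500.0, 300.0), "weight": 0.7},
    {"mean": (4000.0, 2500.0), "std": (800.0, 800.0), "weight": 0.3},
]
simulated = simulate_fire_points(example_clusters, n_points=1000)
```

Choosing the cluster by weighted sampling is one design option; allocating a fixed number of points to each cluster in proportion to its score, as the summary describes, would serve the same purpose.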
Three components of this procedure require further explanation. (1) The meshing method. The study area needs to be meshed to study the potential distribution of different fire risk factors in space. Traditional gridding methods are normally based on regular quadrilateral units. In contrast, this study meshes the study area into 2784 closely connected honeycomb units, which are regular hexagons with side lengths of 1000 m, in ArcGIS 10.6. Although the hexagonal shape is more complicated, its advantages are also very prominent: (i) it can reduce the sample deviation caused by the boundary effect of the grid shape; (ii) it is the most circular polygon that can be tessellated to form a uniform grid; and (iii) its pattern can be accurately recognized.
(2) The entropy method. To minimize subjective factors and some objective limitations in the process of weight determination, this paper uses the entropy method to assign a weight to each index. Entropy is originally a thermodynamic concept in physics that reflects the degree of chaos in a system. In information theory, entropy is a measure of the degree of chaos in the system, while information is a measure of the degree of order. Entropy and information have the same absolute values but opposite signs. In the index data matrix $X = (x_{ij})_{n \times m}$, which consists of $n$ plans to be evaluated and $m$ evaluation indexes, a larger degree of data dispersion corresponds to smaller information entropy, which means a larger amount of information, thus higher importance to the comprehensive evaluation, and consequently a larger weight. In this way, the index weights can be assigned objectively, which mitigates the problem of information overlap among multiple indexes. Practically, this study first evaluates the degree of dispersion of each sample data, then uses information entropy to determine the index weights, and finally assesses the fire risk factors in the urban space. The calculation steps are as follows:

(a) Standardizing the original positive index data:

$$x'_{ij} = \frac{x_{ij} - \bar{x}_j}{s_j}$$

where $x_{ij}$ is the original value of the $i$-th sample and the $j$-th index, $x'_{ij}$ is the standardized index value, and $\bar{x}_j$ and $s_j$ are the average and standard deviation of the $j$-th index, respectively.

(b) Quantifying all indexes in the same way and calculating the proportion of the $i$-th sample under the $j$-th index ($p_{ij}$):

$$p_{ij} = \frac{x'_{ij}}{\sum_{i=1}^{n} x'_{ij}}$$

where $n$ is the number of samples and $m$ is the number of indexes.

(c) Calculating the entropy value of the $j$-th index ($e_j$):

$$e_j = -k \sum_{i=1}^{n} p_{ij} \ln(p_{ij})$$

where $k = 1/\ln(n)$ and $e_j \ge 0$.

(d) Calculating the difference coefficient ($g_j$) of the $j$-th index:

$$g_j = 1 - e_j$$

(e) Normalizing the difference coefficient and calculating the weight of the $j$-th index ($w_j$):

$$w_j = \frac{g_j}{\sum_{l=1}^{m} g_l} \quad (j = 1, 2, \ldots, m)$$

(3) The Monte Carlo simulation. This is a common computation method based on probability statistics, also called the random sampling or statistical testing method. Its principle is that the probability of an event can be estimated from the frequency with which the event occurs in a large number of tests. There are several reasons for choosing this method: (i) it is convenient for performing a large number of repeated samplings of all spatial data, and it can simulate the dynamic relationship between variables randomly, thereby solving uncertain and complex problems; (ii) it is widely applicable and is less constrained by the problem conditions than other numerical methods; and (iii) when performing numerical calculations, its convergence speed is not related to the problem dimension. The assumed function is presented below:

$$Y = f(X_1, X_2, \ldots, X_n) \quad (4)$$

where $(X_1, X_2, \ldots, X_n)$ represents the known spatial distribution probability of fire outbreak points in each cluster after performing the clustering analysis of fire outbreak points over five years.
In most cases, it is very difficult to calculate the probability distribution of Y and its mathematical features via an analytical process. The Monte Carlo method, using a random number generator, can perform a great amount of repeated independent random sampling for the random variables, generating a set of values of the random variables via direct or indirect sampling, and can thus produce a probability distribution of the function Y (namely, the simulated fire outbreaks) that is close to reality. Finally, these sampled values are substituted into Equation (6) in Step 3.2.3 one set after another until the ultimate results are obtained.
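Returning to the entropy method in item (2), steps (b)-(e) can be sketched in a few lines of Python. The sketch assumes that the index values have already been standardized and shifted to be non-negative, because the term p ln(p) is undefined for negative proportions; that shift is an assumption of this illustration and is not stated in the text. The function name `entropy_weights` and the sample matrix are hypothetical.

```python
import math


def entropy_weights(matrix):
    """Entropy-method weights for an n-by-m matrix of non-negative values.

    matrix[i][j] is the value of sample i on index j. At least one index
    must vary across samples, otherwise all difference coefficients are 0.
    """
    n = len(matrix)
    m = len(matrix[0])
    k = 1.0 / math.log(n)
    g = []
    for j in range(m):
        column = [row[j] for row in matrix]
        total = sum(column)
        p = [value / total for value in column]                   # step (b)
        e_j = -k * sum(pi * math.log(pi) for pi in p if pi > 0)   # step (c)
        g.append(1.0 - e_j)                                       # step (d)
    g_sum = sum(g)
    return [g_j / g_sum for g_j in g]                             # step (e)


# Made-up example: the first index is constant across samples, so it
# carries no information; the second index is strongly dispersed.
weights = entropy_weights([[1.0, 1.0], [1.0, 2.0], [1.0, 9.0]])
```

A constant index has maximal entropy and therefore receives a weight of essentially zero, which matches the statement that larger dispersion leads to a larger weight.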

Traffic Model T Incorporating Traffic Jam
In site selection of firefighting facilities, precisely calculating the traffic cost leads to a more accurate evaluation of firefighting service efficiency and capacity. With the help of big data analysis, this paper incorporates the traffic jam factor into the site selection process and thus produces an optimized planned fire station layout. Both data acquisition and processing are implemented using Python 3.6. The data are the real-time traffic speed and congestion status data of Nanjing's road network provided by Amap from 28 May to 10 June 2018, with a data collection interval of 1 h and thus 24 pieces of data per day. The Amap Open Platform Web Service API is used, and the request parameters include the user authority identification (key), the query road level, the return data format type, the callback function, and the longitude and latitude coordinate pairs of the lower left and upper right vertices of the rectangular area to be queried. Among these parameters, key refers to the authorization key that the user applies for on the official Amap website; the query road level, return data format type, and callback function are all set to their default values. The innermost distance in the rectangular area to be queried is required to be less than 10 km, and due to this limitation, the Nanjing city area is segmented into 230 rectangular units (each 0.06° by 0.06°). Then, the units are merged into the actual regional traffic network, and the raw data are pre-processed to link the effective traffic situation information. In the meantime, the traffic situation acquisition program is set to auto-run at an interval of one hour (3600 s) for 28 days in a row, and intelligent batch processing of streaming data is achieved, thus allowing for automatic acquisition, pre-processing, spatialization, and storage (readable in the shapefile format) of the traffic data of the whole city. By doing so, the spatiotemporal dataset of the traffic situation is built. Finally, the characteristic speed of each road is extracted via dimensionality reduction and mapped for visualization.
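The segmentation of the query area into 0.06° by 0.06° rectangles can be illustrated with a short helper. This sketch only produces the lower-left and upper-right corner coordinates of each rectangle; the request to the traffic service itself is deliberately omitted. The function name `tile_bbox` and the bounding box used in the example are hypothetical and are not the extent used in this study.

```python
import math


def tile_bbox(min_lon, min_lat, max_lon, max_lat, step=0.06):
    """Split a bounding box into rectangles of at most step-by-step degrees.

    Returns a list of (lon0, lat0, lon1, lat1) tuples, where (lon0, lat0)
    is the lower-left corner and (lon1, lat1) the upper-right corner.
    """
    # Rounding before ceil guards against floating-point noise such as 2.0000000001.
    n_cols = math.ceil(round((max_lon - min_lon) / step, 9))
    n_rows = math.ceil(round((max_lat - min_lat) / step, 9))
    tiles = []
    for r in range(n_rows):
        for c in range(n_cols):
            lon0 = min_lon + c * step
            lat0 = min_lat + r * step
            lon1 = min(lon0 + step, max_lon)  # clip the last column
            lat1 = min(lat0 + step, max_lat)  # clip the last row
            tiles.append(tuple(round(v, 6) for v in (lon0, lat0, lon1, lat1)))
    return tiles


# Hypothetical extent covering two rectangles side by side.
example_tiles = tile_bbox(118.0, 31.0, 118.12, 31.06)
```

Each tuple can then be passed as the rectangle parameter of one hourly query, and the per-rectangle results merged back into a single road network.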
As the speed obtained by the Amap API is based on real-time road conditions, it is vulnerable to the influences of holidays, major traffic accidents, and traffic jams. However, traffic feature extraction should highlight the stable traffic connections and travel times between facilities and demand points. Therefore, it is necessary to measure the difference and stability of the transit times obtained by the Amap API at different times. Accordingly, this study collects vehicle speed data for the two weeks from 28 May to 10 June 2018 and averages them for each hour. Then, the vehicle speed data of these two weeks are compared with those of weekdays and weekends, respectively. The covariance calculation shows that the average speed distribution curve during the studied two weeks can be well fitted. Therefore, it can be inferred that the difference in the average speed between weekdays and weekends does not affect the analysis of this study. The all-day congestion delay index of Nanjing is calculated to be 1.55, and the average speed of the whole day is about 28.29 km/h. In contrast, the congestion delay index during the rush hour is up to 1.81, and the corresponding average speed is as low as 24.21 km/h. Urban traffic congestion in Nanjing has thus become a serious issue. However, it is not suitable to use the congestion speed for the site selection model, because that would lead to inefficient resource allocation. As a countermeasure, the characteristic speed in this model is set as the two-week average speed of each road, to represent the traffic differences between regions. Multiple small fire stations will be considered in congested areas in our future research.
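The characteristic speed defined above is a per-road average over the hourly observations, which can be sketched as follows. The record layout `(road_id, timestamp, speed_kmh)`, the function name `characteristic_speeds`, and the sample values are hypothetical; records with a missing or non-positive speed are skipped, which is an assumption of this illustration rather than a documented pre-processing rule.

```python
from collections import defaultdict


def characteristic_speeds(records):
    """Average the observed speeds (km/h) of each road over all records.

    records: iterable of (road_id, timestamp, speed_kmh) tuples.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for road_id, _timestamp, speed_kmh in records:
        if speed_kmh is None or speed_kmh <= 0:
            continue  # skip missing or invalid observations
        sums[road_id] += speed_kmh
        counts[road_id] += 1
    return {road_id: sums[road_id] / counts[road_id] for road_id in sums}


# Made-up hourly observations for two roads.
sample_records = [
    ("road_1", "2018-05-28T08:00", 30.0),
    ("road_1", "2018-05-28T09:00", 20.0),
    ("road_2", "2018-05-28T08:00", 40.0),
    ("road_2", "2018-05-28T09:00", None),
]
speeds = characteristic_speeds(sample_records)
```

As a rough consistency check on the figures quoted above, if the congestion delay index is read as the ratio of actual to free-flow travel time (an interpretation assumed here), then 1.55 × 28.29 km/h and 1.81 × 24.21 km/h both imply a free-flow speed of about 43.8 km/h.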

Site Selection Model Based on Random Simulated Demand Points and Traffic Characteristic Speeds
The determined demand points and the traffic network model T are then substituted into the location set cover model to compute $x_j$ and the ultimate set of planned station sites W. For the existing fire stations (corresponding to the set H), the default operation is to directly merge them into the set W (in other words, no demolition or relocation). Owing to the predictability of fire outbreaks, the optimized algorithm of the set cover location model is used to determine, one after another, whether or not to include each station candidate $j$ in the set M in the set W, and ultimately to produce the fire station distribution with the minimal station quantity and the maximal firefighting efficiency improvement. The algorithm is presented below:

$$\min \sum_{j \in W} x_j \quad (5)$$

$$\text{s.t.} \quad \Pr\left(t_{ij} \le t^{r}_{s}\right) \ge \alpha \quad (6)$$

where $i$ is the sequence number of the simulated demand point; $j$ is the sequence number of the new site candidate; V is the set of the demand zone; M is the set of the new site candidates; W is the set of the facility sites planned to be put into service; H is the set of the existing stations; $t_{ij}$ is the actual travel time from the fire station $j$ to the random simulated point $i$ in the demand zone, derived from the traffic network model T based on the actual traffic jam situation; and $x_j$ is a variable equal to one or zero, representing whether or not the site candidate $j$ will be put into service ($x_j = 1$ if it is put into service, and $x_j = 0$ otherwise). The objective function Equation (5) requires a minimal quantity of constructed service stations. The constraint Equation (6) is a global constraint requiring that, for any simulated demand point, the travel time ($t_{ij}$) from the corresponding fire station to its service zone shall be no longer than a specified time ($t^{r}_{s}$), according to the traffic network model T considering the traffic jam. Calculating the actual travel time $t_{ij}$ from the fire station $j$ to the random simulated point $i$ in the demand zone requires determining the corresponding distance and travel speed. Here, we first combine the random demand points generated in Step 3.1 with the potential and existing fire stations, convert their coordinates into a unified system, and export them into the spatiotemporal dataset. Subsequently, the single-source shortest path between two points in the weighted graph (namely the traffic network model T) is calculated using a path solver based on the Dijkstra algorithm, and the path length is divided by the traffic characteristic speed of the corresponding roads to yield the travel time. The travel time results are converted into the time matrix $t_{ij}$. The constraint requires a response probability above 90% over the service range (considering errors, α is manually set to 90%). This paper takes Nanjing city as an example and accurately dissects the spatiotemporal characteristics of firefighting events, based on the actual demand points of firefighting events and the traffic network data. Thus, we are able to measure each variable in the equations more precisely and develop a more reasonable site selection plan. The next chapter introduces the fire data of Nanjing and the data sources.
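To make the travel-time computation and the check of Equation (6) concrete, the sketch below runs Dijkstra's algorithm on a small road graph whose edges carry a length and a characteristic speed, and then reports the share of demand nodes reachable within the target time. It is an illustration under simplifying assumptions: edge weights are expressed directly as travel time (length divided by speed), so the search finds the fastest rather than the shortest route, and demand points are assumed to sit on graph nodes. The names `travel_times` and `coverage_ratio` and the toy graph are hypothetical.

```python
import heapq


def travel_times(graph, source):
    """Dijkstra: minutes from source to every reachable node.

    graph[u] is a list of (v, length_km, speed_kmh) directed edges.
    """
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        minutes, u = heapq.heappop(heap)
        if minutes > best.get(u, float("inf")):
            continue  # stale heap entry
        for v, length_km, speed_kmh in graph.get(u, []):
            candidate = minutes + 60.0 * length_km / speed_kmh
            if candidate < best.get(v, float("inf")):
                best[v] = candidate
                heapq.heappush(heap, (candidate, v))
    return best


def coverage_ratio(graph, stations, demand_nodes, target_min=5.0):
    """Share of demand nodes reachable from some station within target_min."""
    nearest = {}
    for station in stations:
        for node, minutes in travel_times(graph, station).items():
            if minutes < nearest.get(node, float("inf")):
                nearest[node] = minutes
    covered = sum(
        1 for node in demand_nodes
        if nearest.get(node, float("inf")) <= target_min
    )
    return covered / len(demand_nodes)


# Toy network: station S, four demand nodes, made-up lengths and speeds.
toy_graph = {
    "S": [("a", 1.0, 30.0), ("b", 2.0, 24.0)],
    "a": [("c", 1.0, 30.0)],
    "b": [("d", 2.0, 24.0)],
}
ratio = coverage_ratio(toy_graph, ["S"], ["a", "b", "c", "d"])
```

In the toy network, node d is ten minutes away, so the coverage ratio is 0.75 and the constraint with α = 90% would not be met; a candidate site closer to d would have to be added.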

Data Sources

4.1. Study Area
Nanjing, the capital city of Jiangsu Province, has an urban area of 1364.85 km². By 2020, a total of 57 fire stations have been built and put into service in Nanjing (including 2 special-duty fire stations, 54 normal fire stations, and 1 firefighting support fire station). A total of 39 stations are located in the urban area, while 18 lie in the suburban area. There are no underground fire stations. The current distribution of the effectively operated fire stations is generally characterized by high density in the central urban area and low density in the developing urban fringe area. Xuanwu, Gulou, Qinhuai, Yuhuatai, Jianye, Qixia, and Pukou districts have more existing fire stations (points), with relatively concentrated distributions (Figure 3). The Standard for Urban Fire Station Construction (MOHURD Standard 152-2017, a national sector standard of China) stipulates that the service area of a normal fire station in cities should be no larger than 7 km², and that of a normal fire station in suburbs should be no larger than 15 km², which means there should be at least 114 fire stations in Nanjing. The quantity of the existing fire stations is far less than the stipulated one. The service area of an individual fire station in most cases far exceeds the upper limit, and the five-minute arrival required by the standard cannot be realized (at present, the average travel time to the fire site is about 11.6 min). Moreover, traffic congestion in the city has intensified over recent years. Thus, the optimal response time for rescue and disaster relief cannot be guaranteed. It should also be noted that the land that can be used to build fire stations is in short supply in the city, which leads to the uneven distribution of fire stations and considerable firefighting dead zones. For example, the South New Town centered at the Nanjing South Railway Station and the Maqun area east of the Purple Mountain have no fire stations.

Fire Outbreak Data

4.2.1. Basic Characteristics of Fire Outbreaks
Optimization of the fire station layout needs to consider the fire characteristics in Nanjing. With the help of the Nanjing Fire Department, we collect the fire outbreak time, fire location, fire site, and cause of fire recorded by the Emergency Call for Fires (119) from 1 June 2014 to 1 June 2018, and identify a total of 9561 fire records (Table 1). The location information of all fire events is translated into latitudes and longitudes, based on which the fire events are spatially positioned in hexagonal grid cells with radii of 500 m that mesh the Nanjing urban area. A kernel density analysis of the fire outbreak quantity in each grid cell reveals that the hot-spot zones of fires are mainly located in the core of the central urban area (including Gulou, Xuanwu, Qinhuai, and Jianye districts), which is the intensive core of fire outbreaks. The global spatial autocorrelation coefficient reaches 0.875, suggesting extremely high spatial aggregation of fire outbreaks. The major peak occurs in the Xinjiekou area, from which the fire outbreaks gradually decline radially outward; meanwhile, another peak of fire outbreaks occurs in the outlying Gaochun area (Figures 4 and 5).

From a temporal point of view, the fire outbreaks are clearly periodic. Regarding months, July and August are associated with periodic high fire outbreaks, resulting in nearly 200 fire events over the last three years. However, July-August 2014 is a rare valley of fire outbreaks, which is attributed to the fire security circles (for fire prevention and facility protection) set up by the Nanjing Municipal Government for the Youth Olympic Games held at that time in Nanjing. In addition, January and February in winter form the secondary periodic peak of fire outbreaks. April 2014 shows an aperiodic, ultra-high number of fire outbreaks (360 outbreaks).

Fire-Triggering Factors
The cause and site of fires are analyzed using the dominance index. Electrical fires are the main cause of fires in all districts, accounting for 76.3%, followed by careless use of fire in daily life (11.1%), fires from production operations (4.5%), and self-ignition fires (2.5%). Electrical fires are mainly triggered by electrical circuit failure, electrical equipment failure, and electrical heating devices, and often occur in summer and winter. The high ambient temperature (plus the self-heating of running electrical equipment) and thunderstorms in summer promote electrical fires. In winter, strong winds can drive overhead electrical lines into contact with each other, and the resulting discharge causes fires. Moreover, inappropriate use of heating devices and the burning of inflammable materials are also characteristic causes of fire in winter. These analysis results are consistent with the temporal characteristics of fire outbreaks mentioned above. Weather, hazardous goods, gas pipe network distribution, and population distribution are all main factors related to fire outbreaks.
Furthermore, the fire outbreak sites are divided into five types, namely residence land, industrial land, facility land, commercial land, and land for plazas and plants. The most frequent outbreak site is residence land, accounting for 71.2%, while the shares reported for the other land types are 4.5%, 12.7%, and 2.4%, respectively, and other types of fire account for 9.2%. Residence fires are mainly caused by electrical failures, with a correlation coefficient of 0.725. The frequency distribution of daily outbreaks is found to conform to the exponential law. In other words, on most days the number of daily fire outbreaks is very low, and yet on some specific dates the number of fire outbreaks grows dramatically. The presented analysis further reveals that fire outbreaks are highly dependent on residential land and population distribution, and present a certain regularity.

Underlying Fire Risk Evaluation
The fire risk evaluation of the urban area involves various aspects, such as the occurrence, development, and control of fires and firefighting and rescue in the urban area, and is characterized by numerous and complex influential factors. By analyzing the factors affecting fires, the fire risk evaluation calculates the fire outbreak probability, predicts the disaster consequence, and quantifies the fire risk. It can provide scientific references for developing the urban firefighting plan and direct the urban fire safety management to improve the resistance of cities to fire disasters. So far, urban fire risk evaluation cases in China and other countries have mostly focused on establishing the evaluation index system, the evaluation model, and its application. The risk ranking of the U.S. considers risk factors such as the application scenario of architectures, building density, and fire separation, while the "urban ranking system" of Japan mainly considers the type and structural condition of buildings, climate conditions, and the firefighting system [8,10]. Ding and Wang [42] choose the population, firefighting infrastructure, firefighting capacity, public security condition, and major hazardous source as the five primary factors. Zhang [11] further refines the risk factors into the firefighting key area, population density, high-rise building distribution, large crowded underground space, and firefighting taskforce to build the fire risk evaluation framework from the perspective of the urban space, and identifies the high, medium, and low-risk areas. In general, when it comes to building the evaluation system, previous studies often focus on investigating and refining the fire hazardous source, firefighting capacity development, firefighting management status, and regional disaster-resistant capacity. The relevant studies are increasingly mature, and yet with the ever more complicated development of large cities, the urban fire risk evaluation should highlight the spatial analysis of each underlying factor, map the fire risk evaluation onto physical space, and precisely rank the underlying fire risk of each urban land block, in order to realize refined management of urban firefighting security.

Hence, based on the previous studies, the features of this research, and the characteristics of the fire outbreaks in Nanjing, the population density, high-rise building distribution, underground space distribution, site distributions of gas facilities and hazardous chemical substances, and historic fire outbreak frequency are determined as the six major underlying factors for evaluation. The weight and score of each factor and the comprehensive score distribution across the urban area are calculated using the entropy weight method (Table 2, Figures 6 and 7). This research simulates the underlying fire outbreaks and assumes that the firefighting capacity is limited to putting out fires and rescue. Thus, the firefighting capacity is not included as an evaluation factor.

Table 2 (surviving rows; category, factor, weight):
Fire hazardous source; Gas pipe networks; 0.24
Fire hazardous source; Hazardous chemical substances; 0.14
Regional disaster resistance; Underground space distribution; 0.12
Regional disaster resistance; High-rise building distribution; 0.17

By comprehensively considering the urban land use plan and the current land use and construction status, and by eliminating the positions that are unfavorable for building fire stations (e.g., hills and lakes), the land blocks over 2000 m² (according to the Standard for Urban Fire Station Construction, a national sector standard of China) are selected. There are a total of 4084 region blocks, which are defined as the site candidate set M.
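For readers unfamiliar with the entropy weight method used for the risk factors, the following is a minimal sketch of its standard form, not the authors' implementation. The input rows are hypothetical, already-normalized, positive factor values for three grid cells and two factors; the function name `entropy_weights` is likewise an assumption.

```python
import math

def entropy_weights(rows):
    """Entropy weight method on a table of positive, normalized indicator values.

    rows[i][j] is the value of factor j in grid cell i (hypothetical data).
    A factor whose values vary more across cells has lower entropy and gets a larger weight.
    """
    n_cells = len(rows)
    n_factors = len(rows[0])
    diversities = []
    for j in range(n_factors):
        column = [row[j] for row in rows]
        total = sum(column)
        entropy = 0.0
        for value in column:
            if value > 0:
                share = value / total
                entropy -= share * math.log(share)
        entropy /= math.log(n_cells)  # scale entropy into [0, 1]
        diversities.append(1.0 - entropy)
    norm = sum(diversities)
    return [d / norm for d in diversities]

# Hypothetical normalized values: column 0 varies little, column 1 varies strongly.
cells = [
    [0.2, 0.1],
    [0.3, 0.1],
    [0.5, 0.8],
]
weights = entropy_weights(cells)

# Comprehensive risk score of each cell: weighted sum of its factor values.
scores = [sum(w * v for w, v in zip(weights, row)) for row in cells]
```

In this toy input the second factor receives the larger weight because its values differ more between cells, which is the behavior the method is designed to produce.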

Clustering and Simulation of Historic Firefighting Data
The three-dimensional K-means clustering analysis is carried out for the spatiotemporal demand points of fire outbreaks from 2014 to 2018, with K clusters. The average horizontal and vertical coordinates µ of the cluster center are calculated for the points of each cluster and combined with the fire risk value of the corresponding grid cell to form the 3D coordinates for clustering analysis. The clustering adopts mean-variance normalization, and the clustering and Gaussian distribution fitting results of the 2014-2018 data are used. It should be noted that K-means clustering requires the cluster quantity K to be determined in advance; K is set to seven in this research. The main clustering process has three steps: first, K random starting points are chosen as the mass centers; second, the data in the dataset are assigned to the clusters according to their distance to the mass centers, and the average of each cluster is calculated and set as the new mass center; and third, the second step is repeated until no cluster changes (Figures 8-10). According to the elbow method, K = 7 gives the best classification performance for the spatiotemporal features, meaning that no features are concealed while the grouping remains clear. The data clustering and Gaussian fitting results are used in the Monte Carlo simulation to generate random demand (fire outbreak) points.
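The three clustering steps and the subsequent Monte Carlo draw can be sketched as follows. This is a simplified illustration under stated assumptions: points are hypothetical (x, y, risk) triples, normalization is omitted, clusters are sampled in proportion to their size (a stand-in for the paper's clustering score), and each cluster is fitted with independent Gaussians on x and y. The function names are hypothetical.

```python
import random
import statistics

def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's K-means: pick k starting centers, assign, re-average, repeat."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        new_centers = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centers.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:
            break  # no cluster changed
        centers = new_centers
    return centers, labels

def simulate_demand(points, labels, k, n, seed=1):
    """Monte Carlo draw: choose a cluster by size, then sample x and y from fitted Gaussians."""
    rng = random.Random(seed)
    clusters = [[p for p, label in zip(points, labels) if label == c] for c in range(k)]
    clusters = [members for members in clusters if len(members) >= 2]
    sizes = [len(members) for members in clusters]
    samples = []
    for _ in range(n):
        members = rng.choices(clusters, weights=sizes, k=1)[0]
        xs = [p[0] for p in members]
        ys = [p[1] for p in members]
        samples.append((
            rng.gauss(statistics.mean(xs), statistics.stdev(xs)),
            rng.gauss(statistics.mean(ys), statistics.stdev(ys)),
        ))
    return samples

# Hypothetical historic fire points: (x, y, risk score of the grid cell).
points = [
    (0.0, 0.0, 0.2), (0.1, 0.2, 0.3), (0.2, 0.1, 0.2),
    (5.0, 5.0, 0.9), (5.1, 5.2, 0.8), (5.2, 5.1, 0.9),
]
centers, labels = kmeans(points, 2)
samples = simulate_demand(points, labels, 2, 100)
```

With K = 7 and the real 2014-2018 records, the same structure would yield the simulated demand points that feed the coverage constraint; choosing K by the elbow method would require running `kmeans` for several values of K and comparing within-cluster distances.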

Building the Traffic Network Model and Calculating the Minimal Time Matrix t ij
Based on the processed traffic network shapefiles, the traffic network dataset is constructed as the traffic network model via topological operations, such as road intersection breaking and interface merging. The model consists of road intervals and intersections. The road interval is the line element of the traffic network, which is represented by an arc in ArcGIS. Each interval is assigned the attributes of jam vehicle speed (km/h), average vehicle speed (km/h), travel time at the average speed (s), and length (km). The road intersection, represented by a node in ArcGIS, is combined with the turning table for road intersections to mimic actual on-the-road scenarios, such as waiting for the traffic light, no straight-through, no left turn, and passing through the elevated road. The travel time for each road interval is calculated in ArcGIS according to the actual traffic network and the corresponding characteristic speed, and is defined as the average actual travel time. The vehicle speeds of two adjacent weeks in the study period are compared, and the covariance calculation results show that the distributions of the average vehicle speed at the same hours of the two weeks fit each other well, with a correlation coefficient of 0.973. Thus, two weeks of data are periodically representative in our research.
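The two calculations in this step, the per-interval travel time and the week-to-week comparison of hourly speeds, can be sketched in a few lines. The speed values below are hypothetical and the helper names are assumptions; the sketch uses the ordinary Pearson correlation coefficient, which may differ in detail from the authors' covariance calculation.

```python
import math

def segment_seconds(length_km, speed_kmh):
    """Travel time of one road interval in seconds at its characteristic speed."""
    return length_km / speed_kmh * 3600.0

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long series."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical average speeds (km/h) at the same six hours in two adjacent weeks.
week_a = [42.0, 38.0, 25.0, 22.0, 30.0, 40.0]
week_b = [41.0, 39.0, 26.0, 21.0, 31.0, 39.0]
r = pearson(week_a, week_b)

# A 0.5 km interval driven at 30 km/h.
seconds = segment_seconds(0.5, 30.0)
```

A coefficient close to one, as reported for the Nanjing data, supports treating a short observation window as representative of the recurring weekly traffic pattern.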

Generating the Set of the Planned Sites W
The set of the current 57 stations (the set H) is merged directly into the set W, with no station demolition or relocation. Based on the set cover algorithm of Equations (5) and (6), the calculation is made for each station site j to determine whether it is included in the set W, and the actual factors of Nanjing are all input into the global constraint equation. $t_{rs}$ is set to five minutes and α is set to 0.90, which means that, for the clustering-based random simulated demand points i, the travel time from a selected station j (taken from the OD time matrix) is less than five minutes with a probability of at least 0.90. The problem is solved in Matlab R2021a using the genetic algorithm, and the minimal station quantity and the station distribution ($x_j$ for each candidate j) that best improve the firefighting efficiency are obtained.

5.2.5. Adjusting the Model-Produced Results and Reviewing the Planned Land Use for the Fire Station Sites According to the Regulatory Plans of Each District to Ensure the Feasibility of the Ultimate Planned Fire Station Layout

There are 274 ultimate planned fire stations (Figure 11), including 57 existing stations and 217 to-be-built stations. It should be noted that 28 of the model-planned fire stations are adjusted. Among these fire stations, 125 stations mainly serve the central urban area and its surroundings, while 149 stations mainly serve the urban fringe area. The average service range of each fire station is 4.3 km², less than the specified upper limit of 7 km². Our calculation shows that in this plan, the service area that realizes the five-minute arrival of the firefighting taskforce in the central urban area and concentrated construction area accounts for 91% of the total construction land use, and 99% of the construction land use of the central urban area. The locations of the 217 to-be-built fire stations have been fed back to the regulatory plan development group for approval of their land uses. The fire stations are all placed in the street-facing sections of the main and secondary main roads, more than 200 m away from sites storing hazardous goods, such as gas and LNG stations, and 50 m from crowded sites. The firefighting response time is decreased from the current 11.6 min to 5 min.
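The site selection step can be illustrated with a small sketch. Note that it swaps in a simple greedy heuristic in place of the genetic algorithm used in the paper, so it shows the structure of the problem (existing stations kept, candidates added until the share of demand points reached within the target time is at least α) rather than the authors' solver. The time matrix, site names, and the function name `select_stations` are hypothetical.

```python
def select_stations(times, existing, candidates, limit=5.0, alpha=0.90):
    """Greedy set cover sketch: keep existing stations, add candidates until coverage >= alpha.

    times[j][i] is the travel time in minutes from site j to simulated demand point i.
    """
    n_points = len(next(iter(times.values())))

    def reach(site):
        # Demand points that this site reaches within the time limit.
        return {i for i in range(n_points) if times[site][i] <= limit}

    covered = set()
    for site in existing:
        covered |= reach(site)
    chosen = list(existing)
    remaining = set(candidates)
    while len(covered) / n_points < alpha and remaining:
        # Pick the candidate that newly covers the most demand points.
        best = max(sorted(remaining), key=lambda site: len(reach(site) - covered))
        gain = reach(best) - covered
        if not gain:
            break  # no candidate improves coverage any further
        chosen.append(best)
        covered |= gain
        remaining.discard(best)
    return chosen, len(covered) / n_points

def row(reached):
    """Hypothetical time-matrix row: 3 min to reached points, 9 min to all others."""
    return [3.0 if i in reached else 9.0 for i in range(10)]

times = {
    "H1": row({0, 1, 2, 3}),        # existing station
    "M1": row({4, 5, 6, 7, 8}),     # candidate sites
    "M2": row({4, 5}),
    "M3": row({9}),
}
chosen, ratio = select_stations(times, existing=["H1"], candidates=["M1", "M2", "M3"])
```

In this toy case a single added candidate lifts coverage from 40% to 90%, so the search stops without opening the remaining sites. A greedy pass does not guarantee the minimum number of stations, which is one reason a metaheuristic such as the genetic algorithm may be preferred for 4084 candidate blocks.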

Discussion
(1) Optimization of the emergency facility site selection has always been a classic research topic in geography. Given that the fire outbreak probabilities differ among demand points in the various fire risk zones, this research proposes a site selection method that realizes the arrival of firefighting taskforces within the target time at all random simulated fire outbreak points under the constraints of the administrative regulatory plan and the actual traffic situation. The solution workflow based on the set cover model is designed, and a case study of Nanjing city is performed. Compared with the conventional model, the optimized location set cover model greatly improves the covering range and effectively reduces the firefighting response time. Our research also reveals considerable flaws in the layout of the fire stations in Nanjing. Specifically, due to the insufficient and unbalanced distribution of firefighting resources among the districts, the current firefighting facilities are far from satisfying the emergency response demand. The overall cover ratio of the fire station service is only 58%. The firefighting resources are excessively concentrated in the Gulou and Xuanwu districts. Consequently, the firefighting service forms a large regional cover across the central urban area, while multiple firefighting dead zones exist in other current urban built-up areas. A lack of fire stations is the main reason for the large firefighting dead zones.
(2) Various fire risk factors have increasingly emerged due to the continuous concentration of population and resources in large cities. Moreover, the accessibility and rescue efficiency of the urban road network have been seriously affected by overwhelming traffic flows and the resulting obstruction of fire control passages in the central and new urban areas. Furthermore, fire control facilities in many large cities of China are insufficient and outdated. All these factors bring severe challenges to fire safety management in large cities. This study assesses the distribution of urban fire risks, road traffic conditions, and other situations by analyzing the spatial data. In this way, progress has been made in suppressing the negative impacts of random fire occurrences and traffic delays on site planning, thereby saving urban public resources. As more and more cities enter a digital era, our future research will consider integrating different tools into the ArcGIS toolbox, which will help the relevant departments optimize fire station planning. These results will provide references for similar cities in China and thus help enhance the fire control capacities of cities.
(3) The site selection of fire stations in this paper assumes that all facilities belong to the same level. However, firefighting facilities in fact differ in type, level, scale, and service range. Such an assumption to some extent results in errors, which should be corrected in further studies. In addition, the traffic cost based on the traffic network model may change due to road modification and widening. Therefore, it is critical to adjust and update the dynamic factors in this model in a timely manner. The next step is to integrate different tools into the ArcGIS toolbox and visualize them, so as to facilitate the timely correction of the road network and updating of the data. Given this, we plan to collect the traffic data over a prolonged period of time and investigate how the vehicle speed varies across seasons and months, in order to calculate the time-cost factors more accurately and substitute them into our model. We also plan to collect the fire outbreak data of 2019-2020 and import them into our model for further comparison and validation.

Conclusions
Digitalization in the new era has penetrated all aspects of urban life, transportation, and medical care. In this context, the emergence and development of spatio-temporal big data provide new opportunities for the optimization of emergency site selection. This study first introduces the basic model of emergency site selection based on actual traffic conditions and the simulation method based on random demand in space. Then, this study employs the K-means clustering method to quantitatively describe and simulate fire demand based on actual data. Subsequently, an optimization algorithm is established that integrates the actual speed of the road network into the location set cover model, which yields the shortest travel time from the fire station to the simulated demand point at the actual speed. In order to evaluate the validity of the proposed model, this study analyzes the spatio-temporal characteristics of fire data from 1 June 2014 to 1 June 2018 in Nanjing City, identifies various necessary factors based on the modeling analysis, and solves the model with a set target time of 5 min under land-use constraints. Compared with traditional models, the optimized location set cover model greatly improves the coverage area and effectively shortens the fire response time. This study is commissioned by the Nanjing Public Security Fire Bureau, and the results of this study provide important theoretical support for the statutory plan, i.e., the Nanjing Fire Station Layout Plan, implemented by the local government in 2020. Nonetheless, congestion status might change as the city develops, which will have an impact on the existing research results. In future work, we will regularly monitor changes in the road network status and adjust the fire station layout accordingly, including adding some small fire stations where necessary.

Figure 1. Workflow of our research methodology.

Figure 2. Technical workflow of data processing.

Figure 3. The fire outbreak locations and current fire stations in Nanjing.

Figure 6. Evaluation factors for fire risks.

Figure 11. The final layout of fire stations.

Table 1. Example of GPS data for emergency service events.