Location Choice of New Business Establishments: Understanding the Local Context and Neighborhood Conditions in the United States

With the continuing shift toward e-commerce, physical business locations with a brick-and-mortar presence are becoming an endangered element of urban fabric, land use, and the local economy. City governments and local municipalities have created and implemented a variety of strategies and incentives to stimulate new business activity within their jurisdictions. A policy of enhancing the business climate is productive in some regions but not in others. To understand these variations in outcomes, this research examines how region-specific attributes and spatially bounded neighborhood characteristics together affect where new establishments locate. A two-level model, with the census tract as the spatial unit of analysis, is used to analyze new establishments within 27 medium-sized metropolitan statistical areas in the United States. This quantitative model allows the study to identify key regional and neighborhood factors, as well as the existence of previously unmeasured factors, influencing location decisions of new establishments. The results confirm the importance of economic, demographic, and geographic conditions at the neighborhood level, providing a better understanding of the vulnerability of the local economy.


Introduction
Nearly every city and town attempts to attract new businesses to support job creation and affect the health of the community. New business establishments do that by creating jobs, disseminating ideas, and potentially drawing additional businesses. Start-ups create far more jobs than incumbent firms. Between 1977 and 2005 in the United States (U.S.), start-ups created an average of three million new jobs annually, whereas existing firms eliminated, on average, one million jobs per year [1]. New businesses are regarded as the economic engine of regional growth. However, start-up activities differ across regions and across neighborhoods within a region. The purpose of this study is to examine what determines spatial variation in new firm formations.
Little research has been conducted on the spatial nature of new establishments, particularly at the microgeographical level, although there has been considerable research interest in the determinants of new establishment activities. In the field of business management, research has explored the determinants of new establishments around entrepreneurship theory. In this research approach, the focus is on business organization and individual entrepreneurs; thus, spatial factors have been observed yet have not been heavily studied. Another research strand, in the field of economics, focuses on the determinants of new establishments through economic growth theory. These empirical findings are broad or not applicable at the municipality level because they do not fully consider local characteristics or entrepreneurial ecosystems involved in encouraging establishments [2], but instead, rely on economic conditions at the macrogeographical level.
Understanding the relationship between new establishments and economic performance is key for planners and policymakers to implement strategies that stimulate economic growth. Although this relationship is widely noted, the two topics have rarely been treated as connected issues, especially in the planning field. A primary reason is that the two topics rest on different academic foundations [3]. While start-ups are examined as realized outcomes of entrepreneurship in the business management field, economic performance is a primary theme of economic growth in the economics field.
By providing urban conditions as the intersection point, this study bridges the gap between the two topics, as new firm formation and economic performance are ultimately spatial phenomena, and urban settings are an essential space for businesses. Spatial positioning is a significant consideration when activating establishments [4]. Hence, surrounding geographical environments play a key role in where new establishments locate [5]. In particular, clustered economic activities [6] and proximity to knowledge sources [7] have been highlighted as necessary conditions. Urban settings provide an ideal environment to satisfy such conditions [8,9]. This paper takes a careful look at the local context of new establishments [10] by examining the extent to which neighborhood conditions contribute to attracting businesses. Recent literature explores microlevel determinants and attempts to measure the geographic scope of technology spillovers by analyzing spatial clusters of venture capital investment [11], knowledge-intensive start-ups [12], research universities [13], and high-growth firms [14]. These businesses seeking office space gravitate towards targeted dense neighborhoods, which comprise less than 4% of all neighborhoods across the U.S. [11], and the locations outside these areas are underemphasized in empirical studies. By contrast, this study pays attention to the local situation in which routine and basic industries [15], such as retail and non-high-tech start-ups, can also locate. With the continuing shift toward e-commerce, physical locations with a brick-and-mortar presence are becoming an endangered element of urban fabric, land use, and the local economy, all of which are associated with livable and resilient neighborhoods [16].
For example, in a situation where businesses nearby begin to gradually disappear, the local tax base continues to erode, and the average travel distance for goods and services is likely increased, leading to less efficiency in cost and time that is potentially detrimental to community resilience [17]. This research addresses these issues in a different manner by using a two-level model that considers regional and neighborhood factors to examine the full extent of the importance of local characteristics in business formation.
The remainder of the paper begins with an introduction to the theoretical background of new establishments and key findings from empirical studies. Section 3 introduces a typology of location choice behaviors and the two-stage model, and Section 4 presents the data sources and study areas. This is followed by a detailed analysis at the neighborhood level and at the regional level. The final section discusses the findings, implications, and concluding remarks.

Theoretical Background and Literature
The spatial aspects of new establishments are germane to some of the studies of entrepreneurship in the economic growth literature. The effects of new establishments on economic development have been discussed by many studies theoretically [18][19][20] and empirically [21][22][23]. The direction of causality has been a topic under debate: do start-ups influence economic performance, or is it the other way around? On the one hand, the effects of new establishments on economic growth are clear and immediate. Start-ups, for example, always generate new jobs, and this contribution is significant compared to incumbent firms [24]. In addition, the establishments are conducive to advancement in creativity, variety, and efficiency in the market due to using new ideas, providing diversity, and increasing competition, respectively. The effects of economic development on new establishments, on the other hand, are also active, although the effects are not instant. One common index of economic growth is increased GDP, and this will raise the overall demand for new services and products. Hence, there will be more chances of success for new entries due to the results of an economic upturn.
Many scholars have suggested theoretical and empirical approaches to formulate the association between new establishments and economic growth. Van Stel, Thurik, Hartog et al. [25] explained this two-way relationship by creating a conceptual model with a region's GDP. When a region's GDP is used as a measurement of economic performance, new establishments of time t will increase the GDP at the same time t. However, when the GDP affects new establishments, it will take time to be effective, such as one year t + 1 or two years t + 2, when the effect is quantified annually. As a result, very few studies discuss this mutual relationship simultaneously because of the time-lag structure [26]. To consider this situation, Audretsch and Keilbach [27] suggested an empirical model using a simultaneous equation framework.
Accounting for geographical conditions of new businesses and quantifying locational characteristics, empirical studies, mostly in economic geography, have investigated spatial variations of new firm formations using jurisdictional-level variables and indicators. The studies use aggregate measures of locational settings at the state, metropolitan statistical area (MSA), and county levels as a unit of geographical analysis. Such studies provide information regarding the determining factors of new businesses but are not spatially explicit enough to be used for the implementation of local plans and policies. For example, Bartik [28] and Levinson [29] modeled the location choice of manufacturing plants across states using a logit model. Luger and Shetty [30] and Coughlin, Terza, and Arromdee [31] looked more specifically at foreign firms investing in manufacturing plants at the state level. Their common finding at the state level is that more existing manufacturing activity attracts more establishments.
The number of MSA level studies is noticeably smaller compared to the state level. Carlton [32] and Glaeser and Kerr [33] focused only on manufacturing sectors, whereas Strauss-Kahn and Vives [34] examined relocation of headquarters for nine industries. Carlton [32] highlighted the size of the market as measured by population, Glaeser and Kerr [33] emphasized abundant workers in relevant occupations, and Strauss-Kahn and Vives [34] found that airport facilities are a positive, and corporate taxes are a negative, factor in attracting headquarters. At the county level analysis, Guimaraes et al. [35] examined total manufacturing establishments, and Becker and Henderson [36] focused on polluting industries within the U.S. using the Poisson count data model. Guimaraes et al. [35] found that higher property taxes deter start-up investments, and Becker and Henderson [36] showed the non-attainment status in terms of air quality regulation reduces the birth rates of polluting industries. Another county level study approach is modeling location choice of new establishments across counties within one state, for example, by List and McHone [37] in New York, Coughlin and Segev [38] in California, and Gabe and Bell [39] in Maine.
Studies using a unit of geography smaller than the municipality are rare due to data availability, which will be addressed further in the empirical strategy section. At the neighborhood level, Rammer et al.'s [12] study of Berlin suggested that the geographic proximity to knowledge sources, such as universities and research institutes, plays a significant role in encouraging innovation in firms. At the ZIP code level, Florida and King [11] studied start-up and venture capital investment locations, Meltzer and Schuetz [40] analyzed retail business location patterns in New York City, and Rosenthal and Strange [41] modeled new firm formation. Florida and King [11] found that start-ups and investments concentrate in dense ZIP codes, and Meltzer and Schuetz [40] showed that minority and low-income neighborhoods contained fewer and smaller retail stores than predominantly white or high-income neighborhoods. Rosenthal and Strange [41] modeled six industries using the Tobit count model, with the number of new establishments as a dependent variable. They measured the geographic scope of spillover externalities, which decrease as distance increases, and found that the urbanization effect works within one mile, then vanishes, but the localization effect is significant up to five miles. The spatial analysis approach developed by Rosenthal and Strange [41] served as a source of inspiration for the research design of this paper.

Methods
Since a new establishment is a comprehensive and inclusive definition of newly started business activities, additional criteria must be discussed to conceptualize a typology of new establishments. Luger and Koo [42] argued that start-ups are new, active, and independent establishments. For example, a national firm that opens a branch office in a certain city is not a start-up because this establishment is not independent, but rather, a case of new establishments creating new jobs and influencing economic outcomes of the city.
New establishments behave differently when they determine where to locate. Although divergent patterns of the decision-making process are observed, all types of new establishments can be modeled in the framework of a two-stage model. The two-stage location choice model consists of a regional-level decision and a local-level decision. In addition, the "whether" decision indicates a contrast in location choice behaviors across establishment types. The issue of where an establishment locates has different dimensions of choice sets, depending on who is making the decision. Unlike the assumption of homogeneous firms in most models of microeconomics, companies are heterogeneous in size, industry, and organization. When start-up activities have been examined, many papers, especially in the business literature, emphasize the role of firm-level attributes. The founder of the company is a significant factor, especially in the case of a small firm, because the founder is the decision-making entity for every aspect of start-up activities. Figure 1 conceptualizes the decision-making process of location choice behavior by combining the two-stage model with the "whether" question. For example, when a local business decides whether to open, the business is assumed to open in the region where the founder resides. When a national firm decides whether to open a new location, it must also decide in which region to locate. To provide a typology of new establishments, four representative cases are introduced: (I) local businesses, (II) home-based businesses, (III) spin-offs, and (IV) national firms. Table 1 categorizes these cases based on the definition of start-ups by Luger and Koo [42].

Table 1. New establishments classified as start-ups and non-start-ups.
Start-ups: (I) Local businesses; (II) Home-based businesses; (III-I) Independent spin-off businesses.
Non-start-ups: (IV) Branches/plants/facilities of existing national firms; (III-D) Dependent spin-off businesses.
The two-stage model can be a location choice framework for all types of new establishments. Simply put, new establishments can be modeled as a function of regional and local conditions. The most common approach in the literature is McFadden's maximum utility approach, developed by Carlton [32] for location choice behavior of new establishments. The expected profit function of a start-up firm can be estimated equivalently by using a count data model [43].
Number (New Establishments) = f (regional attributes, local attributes). (1)

Multilevel modeling allows the regional and local levels to be modeled simultaneously. This multilevel statistical approach accounts for the shared variance in nested structure data, whereas conventional multivariate models assume independent observations [44]. This assumption is often violated when using hierarchical data. Neighborhoods in the same region likely closely resemble each other and are influenced by the same regional attributes. The method used accurately estimates local level slopes and their implementation in estimating regional level outcomes. When establishment data from $J$ MSA groups at the regional level is collected, each MSA includes a different number of census tracts $n_j$ at the local level. The number of new establishments at a census tract $i$ in MSA $j$ is the dependent variable $Y_{ij}$. For a simple case, the census tract level includes an independent variable $X_{ij}^{(CT)}$, such as population density, and there is one MSA-level independent variable, $Z_j^{(MSA)}$.

$$Y_{ij} = \beta_{0j}^{(CT)} + \beta_{1j}^{(CT)} X_{ij}^{(CT)} + e_{ij} \quad (2)$$

The intercept coefficients $\beta_{0j}^{(CT)}$ at the census tract level become an outcome variable in the MSA-level regression model (3). Likewise, the slope coefficient $\beta_{1j}^{(CT)}$ becomes an outcome variable in the MSA-level regression model (4). In standard two-level notation, with $\gamma_{00}$, $\gamma_{01}$, $\gamma_{10}$, and $\gamma_{11}$ denoting the MSA-level regression coefficients and $u_{0j}$ and $u_{1j}$ the MSA-level residuals, these models are

$$\beta_{0j}^{(CT)} = \gamma_{00} + \gamma_{01} Z_j^{(MSA)} + u_{0j} \quad (3)$$

$$\beta_{1j}^{(CT)} = \gamma_{10} + \gamma_{11} Z_j^{(MSA)} + u_{1j} \quad (4)$$

The census tract level residual $e_{ij}$ follows a normal distribution $N(0, \sigma_e^2)$. The MSA-level residuals are distributed as multivariate normal, with each element of $u_j$ having a mean of zero. The main difference from a usual regression model is that each MSA has a different set of intercept and slope coefficients, that is, random coefficients interacting with the census tract level regression (2).
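To make the nested structure concrete, the following is a minimal sketch in Python that simulates tract-level outcomes whose intercept and slope vary by MSA and then recovers the MSA-level coefficients with a simple two-step procedure (ordinary least squares within each MSA, then a regression of the fitted intercepts and slopes on the MSA-level variable). This two-step shortcut is a stand-in for illustration only, not the multilevel estimator used in the study, and all parameter values, sample sizes, and function names are hypothetical.

```python
import random
import statistics


def ols(x, y):
    """Return (intercept, slope) of a simple least-squares line."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


# Hypothetical MSA-level coefficients (assumed values, not estimates
# from the study): intercept model first, slope model second.
TRUE_COEFS = (1.0, 0.5, 0.2, 0.1)


def simulate(n_msa=27, n_tracts=300, seed=0):
    """Simulate tracts nested in MSAs with MSA-specific intercept and slope."""
    rng = random.Random(seed)
    g00, g01, g10, g11 = TRUE_COEFS
    data = []
    for _ in range(n_msa):
        z = rng.gauss(0, 1)                       # MSA-level variable
        b0 = g00 + g01 * z + rng.gauss(0, 0.10)   # MSA-specific intercept
        b1 = g10 + g11 * z + rng.gauss(0, 0.05)   # MSA-specific slope
        x = [rng.gauss(0, 1) for _ in range(n_tracts)]      # tract-level variable
        y = [b0 + b1 * xi + rng.gauss(0, 0.5) for xi in x]  # tract-level outcome
        data.append((z, x, y))
    return data


def two_step(data):
    """Step 1: OLS within each MSA. Step 2: regress the fitted
    intercepts and slopes on the MSA-level variable."""
    zs, b0s, b1s = [], [], []
    for z, x, y in data:
        b0, b1 = ols(x, y)
        zs.append(z)
        b0s.append(b0)
        b1s.append(b1)
    g00, g01 = ols(zs, b0s)
    g10, g11 = ols(zs, b1s)
    return g00, g01, g10, g11


estimates = two_step(simulate())
```

A full mixed-effects estimator pools information across MSAs instead of fitting each one separately, which matters most when some MSAs contain few tracts.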

Measuring New Establishment
Earlier empirical studies have been interested in the people who founded and operated independent businesses. As a result, self-employment rates have been widely suggested as a measurement of new establishments (e.g., [45]). In addition, self-employment rates are popular in empirical studies due to the easier availability of data. Some researchers criticize it as a measurement because of potential bias towards small-size establishments [33]. Alternatively, the number of new establishments and the amount of new employment within those new firms have been suggested (e.g., [46]). These measures have the advantage of capturing behavioral aspects of start-up activities. On the one hand, a count of new establishments is pertinent to a behavioral unit of decision-making and start-up activities. A firm is the unit that decides new establishment and subsequent location choice, not individual employees in the firm. The second metric, employment in new firms, is intended to quantify the size of enterprises and to show the number of jobs created. The amount of job creation is one of the key indicators of regional economic policy and relates to other important indices, such as the unemployment rate.
This study uses both metrics, the numbers of new establishments and new employment, and normalizes them by the size of the census tract, the spatial unit of observation. This normalization is needed to control for the disparity in the sizes of census tracts. Without normalization, larger census tracts would tend to record more new establishments simply in proportion to their area, as the so-called "dartboard theory" suggests [28].
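The effect of the area normalization can be illustrated with a small Python sketch. The tract records, field names, and the choice of square miles as the area unit are assumptions made for the example, not values from the study's data.

```python
def density_per_sq_mile(count, area_sq_miles):
    """Normalize a raw count by tract area so tracts of different sizes are comparable."""
    if area_sq_miles <= 0:
        raise ValueError("tract area must be positive")
    return count / area_sq_miles


# Hypothetical tracts: the same raw count in a small and a large tract.
tracts = [
    {"tract": "A", "new_estabs": 12, "area_sq_miles": 0.5},
    {"tract": "B", "new_estabs": 12, "area_sq_miles": 3.0},
]

densities = {
    t["tract"]: density_per_sq_mile(t["new_estabs"], t["area_sq_miles"])
    for t in tracts
}
```

Both tracts record the same raw count, but the smaller tract has six times the density, which is the comparison the normalization is meant to expose.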
The Dun and Bradstreet (D&B) database is the principal data source of the study and one of the few firm-level data sets that cover the entire United States. The firm-level database from the U.S. Census Bureau is not open to the public, except for papers affiliated with the U.S. Census Bureau's Center for Economic Studies, so most researchers conduct firm-level analysis using the D&B database. D&B has tracked establishment information since 1933 by collecting and verifying data from thousands of sources, enhancing the accuracy and completeness of the database. However, the D&B database is not flawless, and there are missing data, which is common for massive databases. Rosenthal and Strange [41], after interviewing analysts at D&B and conducting research using the database, support the accuracy of the database for empirical studies on the spatial distribution of establishments: "the omissions from the data set are sufficiently random that the D&B database is representative of the spatial distribution of establishments in the United States. Moreover, the measurement error associated with the distribution of employment across industries within a given geographic zone is likely to be small if one aggregates up by even a modest amount" [41].
D&B provides the following information regarding company profiles: establishment year; industry classification; the number of employees; annual sales; whether it is a headquarters or branch; and X-Y coordinates. The study focuses on new establishments in the year 2001 to control the direction of causality between determining factors and start-up activities: conditions in Census 2000 are taken to determine where new establishments locate in 2001. This one-year time lag allows the study to impose a one-way causal direction and reduce endogeneity problems.

Spatial Unit of Analysis and Study Areas
This research explores the effects of regional and neighborhood conditions on where new establishments locate in medium-sized MSAs. Medium-sized metropolitan areas are defined as those with populations of one to two million in Census 2000; 27 MSAs fit this category (see Table 2 and Figure 2). These MSAs behave differently compared with large- or small-sized MSAs regarding new establishment location choice. First, a medium-sized MSA represents one set of regional attributes as assumed in the model, whereas a larger MSA represents multiple sets. Large-sized MSAs are described as having polycentric configurations [47,48] that contain several business climates. McMillen and Smith [48] and Lee [49] suggested that polycentric structures can be generated when an MSA has a population greater than two million. Second, a medium-sized MSA is often considered for new establishments or new branches of national firms, whereas small-sized MSAs are often neglected. In general, small-sized regions attract only local businesses. In this study, a population of one million is used as the cutoff below which an MSA is treated as small-sized, although there is no objective standard to define small-sized MSAs. Finally, researchers have examined specific mega metropolitan areas, such as New York City [50] and Los Angeles [51], but medium-sized regions are rarely studied for new establishments. This gap is unfortunate when considering the roles of medium-sized metropolitan areas in many states. Medium-sized MSAs are likely more related to the domestic economy at the state level, whereas large-sized MSAs function in the world city network as well.

To estimate the MSA-level effect, the theories and empirical approaches in regional economics [33] are used to classify and select variables at the regional level. In addition to agglomeration economies, demographics, and natural advantages, this study considers the business climate as an additional category to quantify MSA-level traits. After testing a set of variables, nine MSA-level variables were selected for empirical estimations of the model (see Table 3). The neighborhood-level effects on new establishments are identified using variables at the census tract level within the 27 MSAs. Using census tracts for a neighborhood-scale analysis is appropriate because they offer a general level of uniformity and consistent boundaries delineated by the U.S. Census Bureau, and they allow the use of Census datasets. There are 8514 census tracts within the 27 MSAs, with an average of 315.3 tracts per MSA and a median tract area of 1.5 square miles. The variance inflation factor (VIF) was used to screen variables for multicollinearity. For example, distance to central city was excluded, whereas distance to nearest highway was included for modeling. Table 4 shows the correlation coefficients between the selected variables.
This tract-level analysis provides empirical and practical evidence for understanding the association between local conditions and establishment location choice. Empirically, this method helps to understand spatial variation in new establishments across neighborhoods, whereas most existing studies do not explore neighborhood-level effects. In addition, this study estimates the effect of demographic characteristics that relate to local demand for new establishments, which has largely been ignored in the microgeography literature [52]. Practically, geographically local factors have more direct effects on individual establishments than regional factors do, and they can be improved by local efforts. Recent empirical studies have found, for example, that the average January temperature and the education level are significant regional factors that affect where new establishments locate, but these regional characteristics tend to persist over time and are difficult to change in the short term.
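The VIF screening mentioned in this section can be sketched in Python as follows. The sketch relies on the identity that the VIF of each predictor equals the corresponding diagonal element of the inverse of the predictors' correlation matrix. The variable names and values are hypothetical, and the study does not report the cutoff it applied; a common rule of thumb, not taken from the source, flags values above 5 or 10.

```python
import statistics


def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equally long numeric columns."""
    means = [statistics.fmean(c) for c in columns]
    centered = [[v - m for v in c] for c, m in zip(columns, means)]
    norms = [sum(v * v for v in c) ** 0.5 for c in centered]
    if any(n == 0 for n in norms):
        raise ValueError("constant column has undefined correlation")
    k = len(columns)
    return [
        [sum(a * b for a, b in zip(centered[i], centered[j])) / (norms[i] * norms[j])
         for j in range(k)]
        for i in range(k)
    ]


def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting for a small square matrix."""
    n = len(matrix)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular: predictors are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif(columns):
    """VIF of each predictor: diagonal of the inverse correlation matrix."""
    inv = invert(correlation_matrix(columns))
    return [inv[i][i] for i in range(len(columns))]


# Hypothetical tract-level predictors (illustrative values only).
dist_highway = [1.0, 2.0, 3.0, 4.0, 5.0]
dist_center = [2.0, 1.0, 4.0, 3.0, 5.0]
vif_values = vif([dist_highway, dist_center])
```

With only two predictors the VIF reduces to 1 / (1 - r^2), where r is their correlation, which gives a quick check on the general routine.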

Results and Findings
Two regression models were developed: the establishment model to estimate the number of new establishments (Model 1), and the employment model to estimate the number of jobs created from the new establishments (Model 2). A comparison between the two enables the study to understand the effect of a firm's size on its location choice because of the considerable variation in the size of the new establishments. The models use a two-level linear modeling estimation, as introduced in the methods section. This estimation approach is appropriate for the data because the dependent variables contain very few zeros. Unlike empirical studies that focus on selected industries, where many neighborhoods have no establishments and the resulting high share of zero values calls for zero-inflated or Tobit regressions, this study covers all industries. Although the dependent variables are normalized by tract area and are therefore not integers, generalized linear model specifications, such as Poisson and negative binomial, were also tested to account for count data [53]. These specifications yielded largely consistent findings for the establishment model but were not appropriate for the employment model, and the two-level linear estimation performed better overall. The Akaike information criterion (AIC) was used to assess model goodness-of-fit, and the specification with the lowest AIC was selected (see Table 5).
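As a reminder of how the AIC comparison works, the criterion is AIC = 2k - 2 ln L, where k is the number of estimated parameters and L is the maximized likelihood. The Python sketch below computes it for a Gaussian model from its residuals and picks the candidate with the lowest value. The candidate names, parameter counts, and log-likelihoods are hypothetical and do not reproduce Table 5.

```python
import math


def gaussian_log_likelihood(residuals):
    """Maximized Gaussian log-likelihood, using the ML variance estimate RSS / n."""
    n = len(residuals)
    sigma2 = sum(r * r for r in residuals) / n
    return -0.5 * n * (math.log(2 * math.pi * sigma2) + 1)


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical candidates: (log-likelihood, number of parameters).
candidates = {
    "specification_a": (-1210.4, 12),
    "specification_b": (-1198.7, 15),
}
scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
best = min(scores, key=scores.get)
```

AIC values are only comparable across models fitted to the same response variable on the same observations.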

Census Tract Level Results
The census tract level results suggest that demographic conditions are an influential driving factor for new establishment creation. In the literature on entrepreneurship and start-up location, the importance of economic factors, such as employment density, a measure to capture the agglomeration economies, is repeatedly emphasized. However, demographic factors are rarely studied at the neighborhood level. Therefore, the causal relationship between demographic conditions and new establishments is examined in greater detail, after first reviewing economic conditions. Finally, the geographic environment variable, namely, the distance to the nearest highway, is discussed.

Economic Environment
Factors explaining the economic environment include employment density, employment within 3 miles, industrial dominance, and population/employment ratio. Two positive effects and two negative effects are estimated for new establishments. The positive coefficients of employment density and employment within 3 miles confirm the existence of the cluster effect within each census tract and within 3 miles. These positive cluster effects are consistent with empirical findings in the literature. The negative coefficients include the dominance index and the population/employment ratio, explaining the effect of the industry mix and land use. As these two variables are newly added by this study and not studied previously at the neighborhood level, more attention is allocated to interpreting the meanings of these two variables.
Employment density and employment within 3 miles: As expected, a higher existing employment density (EMP_DENS) and higher employment within 3 miles (RING3M) attract more arrivals of new firms. The RING3M coefficient (0.006) is smaller than employment density (0.015) and confirms the existence and attenuation of the spillover effect of employment density [54]. This indicates that new establishments are affected by the locations of incumbent firms within the census tract and up to 3 miles from the tract. Because the cluster effect is due to geographical proximity, the magnitude of the cluster effect diminishes when the distance between existing firms and newly added start-ups increases. However, this model cannot estimate the scope of the spillover effect, as only a three-mile ring variable is used. When additional distance ring variables are included, the scope of this spillover effect can be measured more precisely [12].
A one-unit increase in employment density (employment per acre) increases the log number of new establishments per square mile by 0.015, that is, a 1.5% increase. The 0.006 RING3M coefficient indicates that one thousand more people were employed within 3 miles in the year 2000, and 0.6% more establishments were created in the year 2001. One example of this is spin-off establishments, which prefer proximity to the parent firm, like in Silicon Valley [55].
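The percentage readings of these coefficients follow from the log-linear form of the model. As a small illustration, assuming the dependent variable is a natural logarithm, the exact percent change implied by a coefficient can be computed as 100 × (exp(β) − 1), which is close to 100 × β when β is small. The function name below is hypothetical.

```python
import math


def pct_change_from_log_coef(beta: float, delta: float = 1.0) -> float:
    """Exact percent change in y implied by a coefficient on ln(y)
    when the predictor increases by `delta` units."""
    return 100.0 * (math.exp(beta * delta) - 1.0)


# Coefficients as reported in the text; natural log is an assumption here.
emp_dens = pct_change_from_log_coef(0.015)  # EMP_DENS: one more job per acre
ring3m = pct_change_from_log_coef(0.006)    # RING3M: 1,000 more jobs within 3 miles

print(round(emp_dens, 2), round(ring3m, 2))
```

For coefficients of this size, the exact values differ from the simple approximation only in the second decimal place.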
Industrial dominance: The DOMIN variable quantifies the industry mix. It is higher when one or two industries are dominant at the census tract level, and it has a negative effect on the number of new establishments. One possible interpretation of this finding is that start-up businesses are likely to locate away from areas occupied by dominant industries. When dominant and vertically related firms co-locate to reduce the cost of obtaining resources or shipping goods, unrelated new establishments are crowded out and excluded from geographic integration.
This finding reveals an aspect of the industry mix effect at a micro geographical scale. For example, if there is a massive manufacturing company in a certain neighborhood, new establishments that are less related to it in terms of industry or supply chain [56] are not likely to locate nearby. Additional variables, such as detailed industry classifications of existing and new establishments, could be used to examine how a link between the dominant industry and supply-chain externalities affects a start-up's location choice [57]. This additional information would also help verify the relationship between the dominant industry and the start-up location choice. Another possible reason for the dominant industry's effect on new establishments is that a census tract is small, so the number of sites available for new establishments amid large dominant businesses is insufficient. For example, a university campus census tract already crowded with university buildings has little space available for new establishments.
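To make the idea of a dominance measure concrete, the sketch below computes a Herfindahl-Hirschman index of industry employment shares for a tract. This is an assumption made purely for illustration: the exact definition of DOMIN is not reproduced in this section and may differ, and the function name and employment counts are hypothetical.

```python
def dominance_hhi(employment_by_industry: dict) -> float:
    """Herfindahl-Hirschman index of industry employment shares.

    Ranges from 1/n (employment spread evenly over n industries)
    to 1.0 (all employment in a single industry).
    """
    total = sum(employment_by_industry.values())
    if total <= 0:
        raise ValueError("tract has no employment")
    return sum((jobs / total) ** 2 for jobs in employment_by_industry.values())


# Hypothetical tracts keyed by two-digit NAICS code; job counts are made up.
diverse_tract = {"44": 250, "54": 250, "62": 250, "72": 250}
dominated_tract = {"31": 900, "44": 50, "72": 50}

print(dominance_hhi(diverse_tract), dominance_hhi(dominated_tract))
```

Under this illustrative measure, the tract dominated by a single industry scores much higher than the evenly mixed tract, which is the pattern the DOMIN variable is described as capturing.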
Population/employment ratio: The last variable in the economic environment category, the population/employment ratio, quantifies the extent of residential land use. A higher number indicates that the census tract has more residential than commercial or industrial development. This variable estimates the land use effect on new establishments, as actual land use or zoning variables are not easily collected at the microscale level. It indicates whether new establishments prefer residential areas for proximity to customers or commercial/industrial areas for a closer relationship with existing businesses. This approach provides a means of comparing the demand and supply sides of new establishments. The negative relationship between the population/employment ratio and the location choice reveals that new establishments prefer existing commercial/industrial areas to residential areas. However, as nearby residential population density also has a positive effect, new establishments do not ignore residential areas. This finding is further explored in the next section.

Demographic Environment
Regarding the demographic environment, four variables are employed: population density, percent of whites, median age, and percent of vacant housing units. The only variable with a negative effect is the percent of vacant housing units. Studies have explored the importance of the demographic environment as a demand factor for new establishments, such as new services and products. Although the literature supports the significance of the demand factor in attracting start-ups in theory, empirical studies testing the existence of the demographic effect at the neighborhood level are very limited. Furthermore, this study is the first to introduce the percent of whites, median age, and vacant housing units, whereas previous works have considered only population density among demographic factors.
Population density: As expected, population density has a positive effect on a new establishment's location choice. A one-unit increase in population density (i.e., people per acre) increases the log number of new establishments per square mile by 0.052, which means approximately a 5.2% increase. This result is consistent with the theoretical and empirical findings in the literature. Most new establishments are small businesses, and if they are community-based operations, such as mom-and-pop retail establishments, their target consumers are likely the population nearby. Therefore, population density is a consequential predictor of a start-up's location choice. However, many new establishments do not depend on nearby clientele.
Percent of whites: This variable is included to estimate the effect that the racial mix at the neighborhood level has on a new establishment's location choice. Beyond its original meaning, this variable can serve as a proxy for other correlated factors. For example, income level was originally included in the model but was later excluded due to its high correlation with the percent of whites. Homeownership, which was also originally included in the model, is highly correlated with both income and the percent of whites. The percent of whites was therefore the only one of these variables retained, because it improves the two-level model statistically. This variable increases the number of significant variables at the MSA level, as compared with including other correlated variables at the census tract level. This interdependence between the MSA and census tract levels confirms the importance of a multilevel perspective in analyzing the location choice of new establishments.
The positive effect of the variable indicates that new establishments tend to favor a neighborhood having a greater number of whites, higher incomes, and more homeownership. The average income in the census tract is likely the most significant factor in attracting new establishments because income is typically a proxy for purchasing power. Schuetz, Kolko, and Meltzer [58] focused on the relationship between neighborhood income and retail density at the ZIP code level and found that high-poverty neighborhoods have lower employment density for retail overall. Unlike this study, which focuses on new establishments, Schuetz, Kolko, and Meltzer investigated the neighborhood income effect on existing establishments and demonstrated how demand for retail goods increases with the average income of the neighborhood.
Median age: Although the median age shows a positive effect on new establishments, determining why age would have a positive effect is complicated. Originally, age was included to test the effect of the age distribution within the metro area on new establishments. The median age can affect new establishments in two ways: as a supply factor and as a demand factor. On the supply side, one possible explanation for the positive effect is that houses are used for office space [59], and older people are more likely to start home-based businesses. Many small establishments with fewer than five employees locate in single-family residential areas, and small establishments make up a large portion of the entire dataset. This notable finding highlights a new aspect of a start-up's location choice because residential areas are studied mostly as places of residence rather than as workplaces [60]. As a result, business location theory does not examine residential areas as possible sites for opening businesses because of the traditional "Home/Work Separation" dichotomy [61]. In addition, new businesses are likely initiated by experienced people who have savings and knowledge rather than by younger people with less experience. On the demand side, older people are likely to be better clients for new establishments because they are more financially established than their younger counterparts. Older residents have often already paid off major expenses for basic needs, such as a home mortgage, and thus have more resources available for the new services offered by new establishments.
Percent of vacant housing units: The last variable in the demographic environment category is the percent of vacant housing units. As this coefficient is significantly negative, it is consistent with an interpretation that new establishments tend to avoid declining neighborhoods for their business locations. A high proportion of vacant housing units is a key indicator of neighborhood decline [62]. Since new businesses are established to satisfy increased or expected demand, declining neighborhoods are not attractive locations to start a new business. Furthermore, this variable represents the physical conditions that generally affect the image of the neighborhood. Although the image of the neighborhood is not quantifiable, it typically is a significant factor that affects new firm formations. Therefore, vacancy rates are generally negatively correlated with the image of a neighborhood and can hence be a proxy for the image.

Geographic Environment
The geographic environment is generally the most important and noteworthy factor for any location choice, and new establishments are no exception. The main focus of the geographic environment is the accessibility to new establishments [63]; therefore, to quantify overall accessibility, the distance to the nearest highway was employed. Another variable of accessibility, the distance to the downtown of the central city, was also considered but was highly correlated with the distance to the nearest highway. For medium-sized metropolitan areas in the U.S., the density of highways near downtown areas is high. Hence, the two variables were highly correlated.
As expected, the coefficient on distance to the nearest highway is negative and significant at the 1% level; thus, highway accessibility is an important factor for location choice of a new establishment. This phenomenon can be interpreted in two ways. First, the importance of physical interaction with clientele still exists despite the wide availability of internet-based transactions. Second, face-to-face contact with co-workers and other related businesses is necessary for start-up businesses [64]. That condition is especially true in the beginning stage of new establishments, when interactions with other related businesses are frequent and essential to learn best practices of other successful businesses.

MSA-Level Results
By including the MSA-level variables, the two-level model, in the framework of multilevel regression, estimates weighted random coefficients of regional attributes to control for regional variance. This is one advantage of the multilevel approach compared with general regressions that use regional dummies to control for regional variance. As a result, the MSA-level factors are not just control variables in the model; rather, the estimated coefficients reveal the regional business environment for start-up activities. However, 27 regional observations are too few to estimate regional attributes with high statistical precision. Consequently, the overall level of significance at the MSA level is lower than at the census tract level.
At the MSA level, the four categories of variables are agglomeration economies, business climate, amenities, and regional demographics. Among the nine explanatory variables, MSA employment density, average commuting minutes, and January temperature are significant at the 1% level; three business climate variables are significant at the 5% level; and the percent of the population aged 25 and over with a college degree is significant at the 10% level.
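For reference, a generic two-level random-intercept specification of the kind described above can be written as follows. The notation is illustrative and may differ from that of the methods section: $y_{ij}$ is the outcome for census tract $i$ in MSA $j$ (with $j = 1, \dots, 27$), $x_{kij}$ are tract-level predictors, and $z_{mj}$ are MSA-level predictors.

```latex
\begin{aligned}
\text{Level 1 (census tract } i \text{ in MSA } j\text{):}\quad
  & y_{ij} = \beta_{0j} + \sum_{k} \beta_{k}\, x_{kij} + \varepsilon_{ij},
  \qquad \varepsilon_{ij} \sim N(0, \sigma^{2}) \\
\text{Level 2 (MSA } j\text{):}\quad
  & \beta_{0j} = \gamma_{00} + \sum_{m} \gamma_{m}\, z_{mj} + u_{j},
  \qquad u_{j} \sim N(0, \tau^{2})
\end{aligned}
```

In this form, the MSA-level coefficients $\gamma_{m}$ are identified from variation across only 27 regions, which is why their estimates are less precise than the tract-level coefficients $\beta_{k}$.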
Agglomeration economies: Agglomeration economies variables are used to estimate the regional clustering effect. As expected, MSA employment density has a positive coefficient. Positive coefficients at both MSA and census tract levels confirm that a more compact development pattern, in terms of employment distribution, is conducive to attracting new establishments locally and regionally. The second statistically significant variable in the agglomeration economies is average commuting minutes. The longer the commuting times, the less likely a new establishment is to locate in the MSA. This variable can be considered as an indicator of the overall traffic conditions. Better transportation conditions can facilitate the movement of people and more opportunities for face-to-face meetings within an MSA, for example, between the downtown and suburban employment centers. The results suggest that new establishments tend to avoid environments with longer commuting times and congested traffic conditions.
Business climate: Among the four business climate variables, technology, competition, and environmental policy show significant coefficients at the 5% level. Finance is not statistically significant, although financing initial investments is one barrier for every start-up to overcome. Although originally expected to have a positive coefficient, the technology variable has a negative effect. In general, a higher level of technology within a region is regarded as a better condition to stimulate entrepreneurial activities [65]. However, not every establishment is entrepreneurial or innovative. In the data set, the most frequent industrial sector is administrative and support and waste management and remediation services, North American Industry Classification System (NAICS) 56, accounting for 22.2% of all new establishments. Technology is not always essential to start a new business of this type. By comparison, the technology-based industry of Professional, Scientific, and Technical Services (NAICS 54) accounts for 13.2%. Thus, the level of technology is not a necessary condition for most new establishments, especially those not based on advanced new technologies. One explanation for the expected positive sign is that most start-up studies focus on new industries, such as computer and IT technologies. However, most new establishments are not start-ups in such industries. This focused industry analysis overestimates the importance of technology to starting a new business.
Competition and Environmental Policy: The number of new establishments increases in a competitive business climate, which is consistent with the findings of empirical studies that argue competition between firms is necessary to nurture start-up activities. The association may also run in the opposite direction: new establishments increase competition, so overall competitiveness is expected to rise. Environmental policies have a negative influence on new establishments. However, most new establishments are not in pollution-intensive industries. Only 3.3 percent of the start-ups in 2001 were from the manufacturing sector (NAICS 31-33). If the environmental policy variable reflects the overall degree of public sector regulation, this negative effect is likely relevant for other industries as well. In general, tighter rules reduce the number of new establishments.
Amenities: The number of new establishments increases with the average January temperature, which is a proxy for regional natural amenities. This finding exemplifies the significance of the warm winter effect on new establishments, considering that the January temperature is one of the strongest predictors of regional growth in empirical studies in the U.S. [66]. As a result, a positive relationship between new establishments and regional growth within the 27 MSAs is expected, although the relationship between the two is not the main focus of this study.
Regional demographics: In the final MSA-level category, new establishments are positively associated with the MSA education level. This result may indicate that highly educated people tend to be more entrepreneurial than those who are less educated. In addition to the entrepreneurs' perspective, the education level could characterize an aspect of labor pooling. If the education level indicates the level of skills, in line with research by Florida, Mellander, and Stolarick [67] and Glaeser and Resseger [68], the result suggests that start-up businesses prefer well-trained workers to novices. New establishments attempt to reduce risks by hiring skilled workers. Since the education level is also associated with the income level, higher incomes could increase the number of new establishments by raising purchasing power for novel goods from new establishments. However, the income level on the demand side is already controlled for by a GDP per capita variable, although GDP per capita is not statistically significant in the model.

Conclusions
This research aims to identify the spatial determinants of new establishment location decisions in the U.S. on two geographic levels by examining the regional and neighborhood influences. To observe the effects of the two levels, the MSA-level variables and census tract level variables are derived by accounting for theoretical debates and prior empirical findings.
An examination of the neighborhood-level determinants is a primary contribution of this study, as most new establishments are microenterprises with a small number of employees targeting local markets. Since local markets are associated with neighborhood conditions, neighborhood factors are more influential for local businesses than regional factors. Regional factors also affect new establishments, albeit indirectly. Existing employment and population density are two major neighborhood-level determinants positively influencing new establishments. The employment density effect has approximately twice as much of an impact as the population density effect in predicting the number of new establishments. This study empirically tests the existence of the population density effect on new establishments at the neighborhood scale. Higher rates of vacant housing units, an indicator of a declining neighborhood, are associated with fewer establishments in a neighborhood, although the effect of housing vacancy is the smallest within the demographic category. This provides a better understanding of how vulnerable the local economy is when a neighborhood struggles to retain population and attract new residents. In addition, industrial dominance is negatively associated with new establishments, which may belong to industries other than those that dominate a geographic area. When dominant and vertically related firms are co-located, new establishments unrelated to the dominant firms are crowded out and excluded from the geographic integration. This result is inconsistent with the cluster strategy that encourages co-location and spatial proximity among related industries, although the geographic scale of an industry cluster is generally larger than a census tract. Therefore, this finding can initiate a debate on the desirable role of industrial relatedness at the local level in an effort to provide greater capacity for regional resilience [69].
The results of the MSA-level analysis, including the business climate and amenities, are mostly consistent with the empirical findings in the literature. Among the MSA-level variables, the average January temperature has the most significant positive effect on new establishment location choice. This confirms the well-known warm winter effect: warmer regions grow quickly. One notable result is that, with the exception of the negative effect of commuting minutes, the impacts of agglomeration economies are less dominant than in previous empirical findings. For medium-sized metro areas, the effect of urbanization economies may be smaller than in larger metro areas, since the population and market size of a region determine the extent of its urbanization economies. At the same time, it is important to note that medium-sized regions may need a distinct approach to new business creation that the existing literature, which focuses mostly on megaregions, does not address.
Although this research provides a portrait of the neighborhood business geography and the model predicts and explains factors that affect a start-up's location choice, several issues remain unclear.
This uncertainty stems largely from the assumption that new establishments are homogeneous, although each is unique. Four types of new establishments have been introduced in the study, but additional diverse types exist. Future research should model location decisions by incorporating attributes such as the industry and size of firms. In addition, many unobservable and immeasurable factors are not included in the model. For example, the motives and behaviors of small enterprises depend partly on the personal preferences of founders [70]. Although this study considers the business climate at the regional level, the model does not assess the impact of local regulations and incentives, such as zoning and tax districts for firm creation, which are not sufficiently comparable across regions and neighborhoods to be used as variables in a single model. Because policymaking and budget allotments to implement such initiatives fall largely under the jurisdiction of local governments, future work focusing on individual municipalities could combine the two-level framework of this research with local data on place-based programs and location-specific subsidies.

Conflicts of Interest:
The author declares no conflict of interest.