Highway Freight Transportation Diversity of Cities Based on Radiation Models

Using a unique data set containing about 15.06 million truck transportation records over five months, we investigate the highway freight transportation diversity of 338 Chinese cities based on the truck transportation probability $p_{ij}$ from one city to another. The transportation probabilities are calculated from the radiation model based on the geographic distance and from its cost-based version, which uses the driving distance as the proxy of cost. For each model, we consider both the population and the gross domestic product (GDP), and find quantitatively very similar results. We find that the transportation probabilities have nice power-law tails with tail exponents close to 0.5 for all the models. The two transportation probabilities of a city pair in each model fall around the diagonal $p_{ij}=p_{ji}$ but are often not the same. In addition, the corresponding transportation probabilities calculated from the raw radiation model and the cost-based radiation model also fluctuate around the diagonal $p_{ij}^{\rm geo}=p_{ij}^{\rm cost}$. We calculate four sets of highway truck transportation diversity according to the four sets of transportation probabilities, which are found to be close to each other for each city pair. It is found that the population, the gross domestic product, the in-flux, and the out-flux scale as power laws with respect to the transportation diversity in the raw and cost-based radiation models. This implies that a more developed city usually has higher diversity in highway truck transportation, which reflects the fact that a more developed city usually has a more diverse economic structure.


Introduction
The growing volumes of passenger and freight transport, both regionally and globally, attest to their important role in the economic development of different countries [1][2][3][4][5]. Aviation, railway, highway and shipping are the four main transportation modes in modern societies. Unlike the other three, information about highway transportation is less publicly available. In mainland China, the highway system has developed very rapidly since the Reform and Opening-up of China, forming a rapidly expanding multiplex network that contains national highways, provincial highways, county highways and countryside highways [6]. China has the longest expressway network in the world, with about 0.143 million kilometers of expressways.
In the past decades, the gravity law has been the most widely adopted approach for understanding transportation networks and predicting transportation fluxes [7][8][9][10][11]. It reads

$$W_{ij} \propto \frac{M_i^{\alpha} M_j^{\beta}}{d_{ij}^{\gamma}}, \tag{1}$$

where $W_{ij}$ is the flow between locations $i$ and $j$, $M_i$ (or $M_j$) is usually the population or gross domestic product (GDP) of location $i$ (or $j$), $d_{ij}$ is the distance between $i$ and $j$, and $\alpha$, $\beta$ and $\gamma$ are the model parameters. Very relevantly, the gravity law has been investigated and confirmed in the Korean highway network between the 30 largest cities [7], the express bus flow in Korea consisting of 74 cities and 170 bus routes with 6692 operating buses per day [12], the urban bus networks of Korean cities [13], and the highway freight transportation networks of 338 Chinese cities [6]. However, the gravity model has several limitations, especially the requirement of previous traffic data to fit the parameters [14]. To overcome those limitations, the radiation model has been proposed [14], in which the predicted flux $\hat{F}_{ij}$ from city $i$ to city $j$ is obtained as follows:

$$\hat{F}_{ij} = F_i^{\rm out} \frac{M_i M_j}{(M_i + S_{ij})(M_i + M_j + S_{ij})}, \tag{2}$$

where $S_{ij}$ is the total "mass" (population or GDP) in the circle of radius $d_{ij}$ centered at $i$ but excluding the source and destination population, and $F_i^{\rm out}$ is the total out-flux departing from city $i$,

$$F_i^{\rm out} = \sum_{j \neq i} F_{ij},$$

where $F_{ij}$ is the real flux from $i$ to $j$. Obviously, the data of $F_i^{\rm out}$ are much easier to collect than $F_{ij}$.
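The radiation model of Equation (2) can be sketched in a few lines of Python. This is a minimal illustration rather than the code used in this study: the function names `radiation_probabilities` and `radiation_fluxes` are hypothetical, the masses and distances are supplied by the caller, and cities lying at exactly the same distance as the destination are counted inside the circle, which is an assumption about tie handling.

```python
import numpy as np

def radiation_probabilities(mass, dist):
    """Return p[i, j] = m_i m_j / ((m_i + s_ij) (m_i + m_j + s_ij)).

    mass: 1-D array of city "masses" (population or GDP).
    dist: 2-D array, dist[i, j] is the distance from city i to city j
          (geographic or driving distance; it may be asymmetric).
    """
    mass = np.asarray(mass, dtype=float)
    dist = np.asarray(dist, dtype=float)
    n = mass.size
    p = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # intra-city transportation is not modeled
            # s_ij: total mass within radius d_ij of city i,
            # excluding the source i and the destination j.
            # Assumption: ties (dist == d_ij) count as inside the circle.
            inside = dist[i] <= dist[i, j]
            inside[i] = False
            inside[j] = False
            s_ij = mass[inside].sum()
            p[i, j] = mass[i] * mass[j] / (
                (mass[i] + s_ij) * (mass[i] + mass[j] + s_ij)
            )
    return p

def radiation_fluxes(mass, dist, out_flux):
    """Predicted fluxes: total out-flux of city i times p[i, j]."""
    out_flux = np.asarray(out_flux, dtype=float)
    return out_flux[:, None] * radiation_probabilities(mass, dist)
```

Passing the geographic distance matrix gives the raw model, and passing the driving distance matrix gives the cost-based one. In a finite system without distance ties, each row of `p` sums to one minus the share of the source city in the total mass, so the rows are slightly below one.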
In the raw radiation model, $d_{ij}$ is the geographic distance between $i$ and $j$. The cost-based radiation model was soon proposed, based on the intuition that an individual will choose the site that has the lowest travel cost on the network, where the travel cost can be measured by the path length or travel time from $i$ to $j$ [15]. In this work, $d_{ij}$ in the cost-based model is measured by the path length, i.e., the driving distance, from $i$ to $j$. Later, to better estimate the fluxes at different spatial scales, a scaling parameter was introduced into the radiation model [16]. By combining the memory effect and population-induced competition, a general model has been developed to enable accurate prediction of human mobility based on population distribution only, which also has a parameter quantifying the memory effect [17].
Although the radiation model has been adopted in the study of trip distributions [9][18][19][20][21], applications to freight transportation are rare. In this work, using a unique data set about the highway freight transportation by trucks between 338 cities in mainland China, we investigate the transportation probability $p_{ij}$ between two cities $i$ and $j$ and the transportation diversity of a city calculated from $p_{ij}$. Although most studies dealt with undirected transportation networks [6,22,23], radiation models enable us to consider directed transportation networks due to the availability of data [24]. The raw radiation model and the cost-based radiation model are adopted because they are parameter free.
It has been reported that higher social network diversity provides greater access to social and economic opportunities and has a strong correlation with the economic development [25]. With the highway freight transportation data between Chinese cities available, we aim to investigate the relationship between highway freight transportation network diversity and economic development of cities. Such an analysis has not been conducted due to the difficulty in obtaining the highway freight transportation data. Our analysis shows that the population, the gross domestic product, the in-flux, and the out-flux scale as power laws with respect to the transportation diversity in the raw and cost-based radiation models, which implies that a more developed city usually has higher diversity in highway truck transportation. This finding reflects the fact that a more developed city usually has a more diverse economic structure.
The remainder of this work is organized as follows. Section 2 describes the data sets we analyze. Section 3 studies the basic properties of transportation probability. Section 4 deals with the transportation diversity of cities and their relationship with population and GDP. We discuss and summarize in Section 5.

Data Sets
The data set we analyze was provided by a leading truck logistics company in China. It records the highway truck freight transportation between 338 cities in mainland China over the period from 1 January 2019 to 31 May 2019 [6]. The data cleaning was done by the company, which used the data set in its truck scheduling and route planning. There are about 15.06 million truck freight transportation records in total, each entry containing the origin and destination cities and the starting date of the transportation. We can construct the flux matrix $\mathbf{F} = [F_{ij}]_{338 \times 338}$, where $F_{ij}$ stands for the number of trucks with freight driven from city $i$ to city $j$. Unloaded trucks are not counted. Because radiation models do not consider intra-city transportation, we set $F_{ii} = 0$. It is obvious that $F_{ij}$ is not necessarily equal to $F_{ji}$ for $i \neq j$.
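As an illustration of how such a flux matrix can be assembled, the following minimal sketch counts trips from a list of origin and destination pairs. The record layout (integer city indices) and the function name `build_flux_matrix` are assumptions made for the example; the actual record format of the company data is not described beyond its fields.

```python
import numpy as np

def build_flux_matrix(records, n_cities):
    """Count loaded truck trips per (origin, destination) pair.

    records: iterable of (origin_index, destination_index) tuples,
             one tuple per loaded truck trip (hypothetical layout).
    """
    flux = np.zeros((n_cities, n_cities), dtype=np.int64)
    for origin, destination in records:
        flux[origin, destination] += 1
    # Radiation models ignore intra-city transportation, so F_ii = 0.
    np.fill_diagonal(flux, 0)
    return flux

# Toy example with three cities; the (2, 2) trip is intra-city and dropped.
records = [(0, 1), (0, 1), (1, 0), (2, 2), (0, 2)]
F = build_flux_matrix(records, 3)
out_flux = F.sum(axis=1)  # total out-flux of each city (row sums)
in_flux = F.sum(axis=0)   # total in-flux of each city (column sums)
```

The row sums give the total out-flux of each city and the column sums give the total in-flux, and the matrix is in general not symmetric.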
The GDP and population data for the 338 Chinese cities in 2017 were retrieved online from the Complete Collection of World Population (http://www.chamiji.com, accessed on 18 May 2021), which are publicly available except for a few cities. We supplemented the missing data by searching Baidu Encyclopedias (https://baike.baidu.com, accessed on 18 May 2021).
The geographic distance $d_{ij}^{\rm geo}$ is the shortest surface distance between two cities located by their longitude and latitude, that is, the length of the great-circle arc connecting the two points on the surface of the earth. The longitude and latitude of each city can be easily obtained online for free. The data set of the driving distances $d_{ij}^{\rm cost}$ between pairs of cities was provided by the same truck logistics company and was collected by its truck drivers. The driving distance between two cities is usually "optimized" by the truck drivers because they always have the motivation to find a path connecting the two cities with the least cost (time and money). Such an optimization is achieved either through their own experience or through information from fellow truck drivers they trust. It is obvious that $d_{ij}^{\rm geo} \le d_{ij}^{\rm cost}$ for all pairs of cities. The difference between these two distances increases as the two cities get farther apart. By definition, the geographic distance matrix is symmetric, that is, $d_{ij}^{\rm geo} = d_{ji}^{\rm geo}$. In contrast, the driving distance matrix is asymmetric, i.e., $d_{ij}^{\rm cost} \neq d_{ji}^{\rm cost}$ in general, which is mainly due to the fact that, besides highways, there are often local roads that a truck driver has to take from one city to the other.
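The great-circle distance can be computed from longitude and latitude with the haversine formula. The sketch below is one common way to do it, not necessarily the procedure used for this data set; the mean Earth radius of 6371 km and the function name `great_circle_km` are assumptions of the example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius

def great_circle_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    # Haversine formula; a is the squared half-chord length.
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp against rounding so that asin stays within its domain.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))
```

Because the formula depends only on the two endpoints, the resulting distance matrix is symmetric by construction, in line with the symmetry of the geographic distance noted above.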

Formulae
According to the radiation model in Equation (2), the transportation probability $p_{ij}$ from city $i$ to city $j$ is the ratio of the predicted flux to the total out-flux of city $i$:

$$p_{ij} = \frac{\hat{F}_{ij}}{F_i^{\rm out}} = \frac{M_i M_j}{(M_i + S_{ij})(M_i + M_j + S_{ij})}.$$

When we choose population $P$ for $M$, the transportation probability becomes

$$p_{ij} = \frac{P_i P_j}{(P_i + S_{ij})(P_i + P_j + S_{ij})}, \tag{9}$$

where $S_{ij}$ is the total population in the circle of radius $d_{ij}$ centered at $i$ but excluding the source and destination population. Alternatively, when we use GDP as the proxy, we have

$$p_{ij} = \frac{G_i G_j}{(G_i + S_{ij})(G_i + G_j + S_{ij})}, \tag{10}$$

where $S_{ij}$ is the total GDP in the circle of radius $d_{ij}$ centered at $i$ but excluding the GDP of the source and the destination.

Power-Law Distribution of $p_{ij}$
The transportation probabilities $p_{ij}$ of the raw radiation model using geographic distance and of the cost-based radiation model using driving distance are calculated with respect to population $P$ in Equation (9) and gross domestic product $G$ in Equation (10). Figure 1 illustrates the four empirical distributions of the transportation probability $p_{ij}$ between two cities for the two radiation models with $M = P$ and $M = G$, respectively. We observe a nice power-law tail in each case, with the same tail exponent $\alpha \approx 0.5$ and almost the same intercept in the four cases. The power-law relationship holds over three orders of magnitude. The smallest transportation probabilities deviate from the power-law distributions with higher probability density. Theoretically, we know that two cities separated by a longer distance usually have a smaller transportation probability. Indeed, if we plot $p_{ij}$ with respect to $d_{ij}$, we find that the points fluctuate around a power-law scaling with an exponent of −4:

$$p_{ij} \sim d_{ij}^{-4},$$

which corresponds to the case of uniform population (or GDP) density [14]. The standard deviation of the data points from this reference power law quantifies the strength of heterogeneity of the spatial distribution of population and GDP in mainland China.
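The exponent −4 of the reference power law can be checked with a short worked derivation. It is a sketch under simplifying assumptions that are not properties of the data: all cities have the same mass $m$, the mass is spread with a uniform density $\rho$, and the distance $d_{ij}$ is large enough that the mass inside the circle dominates $m$.

$$S_{ij} \approx \rho \pi d_{ij}^{2} \gg m, \qquad p_{ij} = \frac{m^{2}}{(m + S_{ij})(2m + S_{ij})} \approx \frac{m^{2}}{S_{ij}^{2}} \approx \frac{m^{2}}{\rho^{2} \pi^{2}}\, d_{ij}^{-4}.$$

Deviations of the empirical points from this line therefore reflect departures from uniform density.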

Asymmetric Relationship between $p_{ij}$ and $p_{ji}$
We illustrate in Figure 2 the asymmetric relationship between $p_{ij}$ and $p_{ji}$ for the two radiation models using population. The results for GDP are very similar for each model. It is striking that the predicted values of transportation probability span nine orders of magnitude. We also find that the scatter points lie close to the diagonal $p_{ij} = p_{ji}$. The points from the cost-based model in Figure 2b concentrate more closely around the diagonal than the points in Figure 2a, and thus the transportation probability matrix $\{p_{ij}\}$ is less asymmetric. The two dashed lines impose a restriction on the transportation probability values, requiring that

$$p_{ij} + p_{ji} \le 1, \tag{13}$$

which is more visible if we use linear coordinates. This restriction can be derived as follows. According to Equation (9), the probability of transportation from city $j$ to city $i$ is

$$p_{ji} = \frac{P_j P_i}{(P_j + S_{ji})(P_j + P_i + S_{ji})}.$$

For two given cities $i$ and $j$, both $p_{ij}$ and $p_{ji}$ decrease as $S_{ij}$ and $S_{ji}$ increase, so they reach their maxima when the two cities are adjacent, that is, $S_{ij} = S_{ji} = 0$. In this case, we have

$$p_{ij} = \frac{P_j}{P_i + P_j} \tag{16}$$

and

$$p_{ji} = \frac{P_i}{P_i + P_j}. \tag{17}$$

These two maxima sum to 1, and the restriction shown in Equation (13) is thus obtained. This argument holds for both of the radiation models, because the derivation is independent of the definition of the distance between two cities. It also applies to the two models based on GDP, as expressed in Equation (10).

Comparison between $p_{ij}^{\rm geo}$ and $p_{ij}^{\rm cost}$
We compare the predicted transportation probabilities from the two models. The results are shown in Figure 3. We find that the points fluctuate around the diagonal line

$$p_{ij}^{\rm geo} = p_{ij}^{\rm cost}.$$

The insets show that there are many points that fall exactly on the diagonal. These points correspond to the situations when

$$S_{ij}^{\rm geo} = S_{ij}^{\rm cost}. \tag{19}$$

Usually, this condition (19) is more likely to be fulfilled when the two cities $i$ and $j$ are close. As a special case, when city $j$ is the closest city of city $i$, we have $S_{ij}^{\rm geo} = S_{ij}^{\rm cost} = 0$.

Transportation Diversity
We now define the transportation diversity $D_i$ of a city $i$ based on its transportation probabilities $p_{ij}$, where $p_{ij}$ can be calculated from the two radiation models using either population $P$ or gross domestic product $G$. We calculate four sets of diversity $D_i^{M,d}$, where $M = P$ or $M = G$ and $d = d^{\rm geo}$ or $d = d^{\rm cost}$. Similar diversity measures have been proposed and studied for human mobility and communication [25][26][27].
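For readers who want to experiment, the sketch below computes one possible diversity measure from a matrix of transportation probabilities. It assumes a Shannon entropy of each row normalized by $\ln(N-1)$, in the spirit of the social diversity of [25]; both the entropy form and this normalization are assumptions of the example and are not taken from the definition used in this work. The function name `transport_diversity` is hypothetical.

```python
import numpy as np

def transport_diversity(p):
    """Normalized Shannon entropy of each row of p (assumed definition).

    p: square array, p[i, j] is the transportation probability from
       city i to city j; diagonal entries are ignored.
    """
    q = np.array(p, dtype=float)  # copy so the caller's array is untouched
    np.fill_diagonal(q, 0.0)
    n = q.shape[0]
    # Use the convention 0 * log(0) = 0 by only taking logs where q > 0.
    log_q = np.zeros_like(q)
    np.log(q, out=log_q, where=q > 0)
    entropy = -(q * log_q).sum(axis=1)
    # Assumption: normalize by the entropy of a uniform spread over
    # the N - 1 possible destinations.
    return entropy / np.log(n - 1)
```

With this normalization, a city that sends freight evenly to all other cities has diversity 1, and a city that sends everything to a single destination has diversity 0.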

Comparison of Diversity Based on Population and Gross Domestic Product
In Figure 4, we compare the six pairs formed from the four diversity sets obtained. The two plots in the top row show the influence of distance on diversity for a fixed choice of $M$, while the two plots in the bottom row illustrate the influence of the choice of $M$ on diversity in a given model. We find that, in each plot, there is a nice linear relationship between the two diversity sets. The influence of the choice of model is found to be weaker than that of the choice of $M$.

Dependence of City Traits on Diversity
We further check the dependence of city traits ($P$, $G$, $F^{\rm out}$, or $F^{\rm in}$) on the truck transportation diversity $D_i$, where $F_i^{\rm in}$ is the total in-flux arriving at city $i$:

$$F_i^{\rm in} = \sum_{j \neq i} F_{ji}.$$

The results are depicted in Figure 5. In the four plots of Figure 5e-h for $D_i^{P,{\rm cost}}$, we observe two outliers that seem isolated from the other points. These outliers correspond to the same two cities, Shennongjia Forestry District and Ali District. The diversities of these two cities are respectively 0.1496 and 0.1529. We observe a power-law dependence in each plot. We can write that

$$Y_i \sim \left( D_i^{M,d} \right)^{\beta(Y,M,d)},$$

where $Y$ represents $P$, $G$, $F^{\rm out}$ or $F^{\rm in}$, $M$ stands for population $P$ or gross domestic product $G$ in the radiation model, and $d$ denotes the geographic or driving distance. The power-law exponents $\beta(Y, M, d)$ are estimated with ordinary least-squares regression and are presented in Table 1. For a given city trait and the chosen $M$, the two power-law exponents are similar in the raw radiation model and the cost-based radiation model.
In contrast, the power-law exponent is larger when we use population P as M in the radiation models.
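The estimation of a power-law exponent by ordinary least squares can be sketched as follows. The sketch assumes that the regression is run on log-transformed variables, which is the usual way to fit a power law with ordinary least squares but is not stated explicitly in the text; the function name `fit_power_law_exponent` is hypothetical.

```python
import numpy as np

def fit_power_law_exponent(diversity, trait):
    """Fit trait ~ c * diversity**beta by OLS in log-log coordinates.

    Returns (beta, c). All inputs must be strictly positive.
    """
    x = np.log(np.asarray(diversity, dtype=float))
    y = np.log(np.asarray(trait, dtype=float))
    # A degree-1 polynomial fit is ordinary least squares on (x, y):
    # the slope is the exponent, the intercept is the log of the prefactor.
    beta, log_c = np.polyfit(x, y, 1)
    return beta, np.exp(log_c)

# Synthetic check: data generated from an exact power law.
d = np.array([0.2, 0.4, 0.6, 0.8])
y = 3.0 * d ** 2.5
beta, c = fit_power_law_exponent(d, y)
```

On the synthetic data the fit recovers the generating exponent and prefactor up to rounding error. On real data, isolated points at very low diversity can have a large influence on the slope in log-log coordinates.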

Discussion and Conclusions
In this work, we investigated the highway freight transportation diversity of 338 Chinese cities based on the transportation probability $p_{ij}$ from one city to another. The transportation probabilities are calculated from the raw radiation model based on geographic distance and the cost-based radiation model based on driving distance as the proxy of cost.
We found that, in either the raw radiation model or the cost-based radiation model, the results obtained with the population and the gross domestic product are quantitatively similar. This is mainly due to the nice power-law scaling between population and GDP of Chinese cities, where the power-law scaling exponent is estimated to be 1.15 ± 0.08 [6,28].
We investigated several important properties of the truck transportation probability $p_{ij}$. It is found that the transportation probabilities are distributed broadly with a nice power-law tail and that the tail exponents are close to 0.5 for the four models. It is also found that the transportation probability matrix in each model is asymmetric, such that $p_{ij}$ does not necessarily equal $p_{ji}$, which is consistent with our intuition.
We also found that the population, the gross domestic product, the in-flux, and the out-flux scale as power laws with respect to the transportation diversity in the raw radiation model and the cost-based radiation model. It is intuitive that a city with higher GDP (often with larger population) usually has higher diversity in its industrial structure. These cities usually have higher diversity in highway freight transportation.
The strong correlation between transportation diversity and economic development implies a strong association between industry diversity and economic development. Although a causal direction of this relationship cannot be established through our analysis, transportation diversity at least provides a structural signal for the economic development of a city, highlighting the potential benefit of industry-targeted policies for economic development. Further research is required to obtain reliable policy implications. In particular, longitudinal data sets for transportation networks and economic development are required to establish a possible causal relationship.

Data Availability Statement:
We signed a confidentiality agreement with the transportation company that provided us with the data used in this work. Hence the data will not be shared.

Conflicts of Interest:
The authors declare no conflict of interest.