Global Container Port Network Linkages and Topology in 2021

The maritime transport of containers between ports accounts for the bulk of global trade by weight and value. Transport impedance among ports, arising from transit times and port infrastructure, can, however, affect the accessibility, trade performance, and attractiveness of ports. Assessing the transit routes between ports against performance and attractiveness criteria can yield a topological liner shipping network that quantifies the performance profile of ports. Here, we constructed a directed global liner shipping network (GLSN) from the routes of the top six liner shipping companies between the ports of Africa, Asia, North/South America, Europe, and Oceania. Network linkages and community groupings were quantified through a container port accessibility evaluation model, which combined the betweenness centrality of ports, the transport impedance among ports measured by the transit time, and the performance of ports measured by the Port Liner Shipping Connectivity Index. The in-degree and out-degree of the GLSN each conformed to a power-law distribution, with an R-square fitting accuracy greater than 0.96. The community partition was clearly consistent with the actual trade flow. The accessibility evaluation showed that the ports in Asia and Europe had a higher accessibility than those of other regions. Most of the top 30 ports with the highest accessibility were Asian (17) and European (10) ports. Singapore, Port Klang, and Rotterdam had the highest accessibility. Our research may be helpful for further studies such as species invasion and the planning of ports.


Introduction
Over 80% of international trade volume, accounting for 70% of its trade value, is carried by ships and handled by seaports around the world [1]. Maritime transport is regarded as the backbone of global trade and the lifeblood of the global economy. The operational performance of ports and the links between ports together form the maritime transport network, which has an important influence on global maritime trade [2]. Thus, research on maritime transport networks as well as port connectivity and accessibility has received increasing attention in recent years [3,4].
Traditional shipping studies relied upon indicators such as the GDP or freight index to analyze the trade volume and value between countries [5-7]. In 2004, the United Nations Conference on Trade and Development (UNCTAD) proposed the Liner Shipping Connectivity Index (LSCI) for shipping connectivity [8] and later extended the LSCI to the Port-LSCI (PLSCI) in 2019 to quantify the efficiency of a port for handling ships and cargo. Many researchers have, therefore, started to use the PLSCI for port competitiveness assessments. For example, Tovar and Wall [9] used the PLSCI to analyze the relationship between port connectivity and port productivity, and conducted case studies in 16 Spanish ports. Their results illustrated a strong positive correlation between the connectivity and efficiency of ports. In addition to the PLSCI, researchers have developed a variety of port

Data
The data used in this article included the service routes of the liner companies, the PLSCI, and the basic attributes of the ports such as the country code, latitude, and longitude, as shown in Table 1.

Table 1. Attributes merged and used in this study.

Attribute Name | Description
Port Name | Port names matched among the routes from 6 liner shipping companies, the PLSCI data from the UNCTAD, and the port data from IHS Markit
PLSCI | Published by the UNCTAD quarterly

Shipping Lines
We collected the service schedules published on the websites of the top six liner shipping operators (about 71.3% of the global TEUs in 2021, according to Alphaliner.com, as shown in Table 2) from 13 July to 26 July 2021. The route data included the departure port, arrival port, and transit time between ports. The basic attributes of the ports (e.g., the coordinates and country code), bought from IHS Markit, were matched with the name and location of the ports. The GLSN included 564 unique ports (nodes) and 9474 routes (repeated routes of different companies were merged into 2971 directed links), as shown in Figure 1.
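The merging of repeated routes into directed links can be illustrated with a minimal sketch. The record layout, the sample ports and transit times, and the helper name `merge_routes` are hypothetical and chosen for illustration only; they are not the authors' data or pipeline.

```python
from collections import defaultdict

# Hypothetical sample records: (company, departure port, arrival port, transit days).
routes = [
    ("A", "Singapore", "Port Klang", 1.0),
    ("B", "Singapore", "Port Klang", 2.0),
    ("A", "Port Klang", "Singapore", 1.0),
    ("C", "Singapore", "Rotterdam", 24.0),
]

def merge_routes(records):
    """Collapse repeated routes into directed links keyed by (origin, destination)."""
    times = defaultdict(list)
    for _company, origin, destination, days in records:
        times[(origin, destination)].append(days)
    # Each link keeps its occurrence count and its mean transit time.
    return {
        link: {"occurrences": len(t), "mean_transit_days": sum(t) / len(t)}
        for link, t in times.items()
    }

links = merge_routes(routes)
ports = {port for link in links for port in link}
print(len(ports), "ports,", len(links), "directed links")
```

Keying on the ordered pair (origin, destination) keeps the two directions of a route as separate links, which is what makes the resulting network directed.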


PLSCI
The Liner Shipping Connectivity Index (LSCI), published by the UNCTAD annually since 2004, evaluates the degree of integration for countries connected to the global liner shipping network. It was further improved by the Port-LSCI (PLSCI), which covers more than 900 container ports around the world and is updated quarterly. We collected the PLSCI data from GLSN ports in the second quarter of 2021 from the UNCTAD website.

Methods
The following methods were applied to determine the network linkages and topology of the port connections in 2021. The complex network analysis methods used included the average degree, average clustering coefficient, average shortest path length, and betweenness; these were applied to the GLSN. The Leiden community detection algorithm was used to discover the trade topology structure of the GLSN and the modularity of the partition result was calculated. Finally, an accessibility evaluation model was proposed.

Complex Network Analysis Factors
In this section, definitions are provided for the factors used in the GLSN, including the average degree, average clustering coefficient, average shortest path length, and betweenness.
Degree and average degree: The degree of a node is the number of links adjacent to the node. The in-degree and out-degree are considered separately in directed networks. The average degree $\langle k \rangle$ for a directed network is defined as follows:

$$\langle k \rangle = \frac{1}{N}\sum_{i=1}^{N} k_i^{in} = \frac{1}{N}\sum_{i=1}^{N} k_i^{out} = \frac{L}{N}$$

where $N$ represents the total number of nodes in the network; $k_i^{in}$ represents the number of links that point into node $i$; $k_i^{out}$ represents the number of links that point out from node $i$ to other nodes; and $L$ represents the total number of links (regardless of direction) in the network.
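A small sketch of the degree definitions follows. The toy link list and the helper name `degree_summary` are illustrative assumptions; the last line only checks the arithmetic L/N against the node and link counts reported for the GLSN (564 ports and 2971 directed links).

```python
from collections import Counter

def degree_summary(links):
    """Return in-degrees, out-degrees, and the average degree L / N."""
    links = set(links)  # repeated links count once
    in_deg = Counter(dst for _src, dst in links)
    out_deg = Counter(src for src, _dst in links)
    nodes = set(in_deg) | set(out_deg)
    return in_deg, out_deg, len(links) / len(nodes)

# Hypothetical toy network with N = 4 nodes and L = 4 directed links.
toy_links = [("A", "B"), ("B", "A"), ("A", "C"), ("C", "D")]
in_deg, out_deg, avg_degree = degree_summary(toy_links)
print(in_deg["A"], out_deg["A"], avg_degree)  # 1 2 1.0

# Arithmetic check with the counts reported for the GLSN.
reported_avg_degree = round(2971 / 564, 2)
print(reported_avg_degree)  # 5.27
```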
Average shortest path length: The average shortest path length (ASPL) is defined as the average number of steps along the shortest paths for all possible pairs of network nodes. It is a measure of the efficiency of information or mass transport on a network.
Consider a network $G$ with the set of vertices $V$, and let $dist(v_1, v_2)$ denote the length of the shortest path from $v_1$ to $v_2$ ($v_1, v_2 \in V$). If there is no path from $v_1$ to $v_2$, or if $v_1 = v_2$, then $dist(v_1, v_2) = 0$ and $has\_path(v_1, v_2) = 0$; otherwise, $has\_path(v_1, v_2) = 1$. The ASPL for network $G$ can then be defined as:

$$ASPL = \frac{\sum_{i,j}^{N} dist(v_i, v_j)}{\sum_{i,j}^{N} has\_path(v_i, v_j)}$$

where $\sum_{i,j}^{N} dist(v_i, v_j)$ represents the sum of all shortest path lengths and $\sum_{i,j}^{N} has\_path(v_i, v_j)$ represents the total number of paths [32].
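The ASPL definition, with unreachable pairs excluded through has_path, can be sketched with a breadth-first search from every node. The adjacency dictionary and the function name below are hypothetical; links are treated as unweighted steps, as in the definition.

```python
from collections import deque

def average_shortest_path_length(adj):
    """ASPL of a directed graph {node: [successors]}, ignoring unreachable pairs."""
    total_dist = 0   # sum of dist(v_i, v_j) over reachable ordered pairs
    total_paths = 0  # sum of has_path(v_i, v_j)
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj.get(u, ()):
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total_dist += sum(dist.values())
        total_paths += len(dist) - 1  # exclude the source itself
    return total_dist / total_paths if total_paths else 0.0

# Hypothetical toy network: a directed cycle A -> B -> C -> A plus a feeder D -> A.
toy_adj = {"A": ["B"], "B": ["C"], "C": ["A"], "D": ["A"]}
aspl = average_shortest_path_length(toy_adj)
print(round(aspl, 4))
```

In this toy network no port can reach D, so the three pairs ending at D contribute to neither the numerator nor the denominator.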
Average clustering coefficient: The degree of clustering of a whole network is captured by the average clustering coefficient $C$, representing the average of $c_v$ over all nodes in the network:

$$C = \frac{1}{N}\sum_{v \in V} c_v$$

The clustering coefficient for each node in the directed network is calculated as follows, according to [18]:

$$c_v = \frac{T_v}{2\left[k_v^{tot}\left(k_v^{tot} - 1\right) - 2k_v^{\leftrightarrow}\right]}$$

where $k_v^{tot}$ is the sum of the in-degree and out-degree of the node $v$ in the network, $T_v$ is the number of directed triangles passing through node $v$, and $k_v^{\leftrightarrow}$ is the reciprocal degree of $v$.

Betweenness centrality: Betweenness centrality is an indicator measuring the influence of the nodes based on the shortest path. The betweenness centrality of node $v$ is the sum of the fraction of the shortest paths of all pairs that pass through $v$:

$$c_B(v) = \sum_{s,t \in V} \frac{\sigma(s, t|v)}{\sigma(s, t)}$$

where $\sigma(s, t)$ is the number of shortest paths between node $s$ and node $t$ and $\sigma(s, t|v)$ is the number of those paths passing through a node $v$, other than $s$ or $t$. Betweenness centrality can be normalized for directed networks as:

$$c_B'(v) = \frac{c_B(v)}{(N-1)(N-2)}$$

Community detection: The Leiden algorithm package for Python [33] was used for the shipping community detection. The Leiden algorithm [34], which extends the Louvain algorithm [35], is widely regarded as one of the best algorithms for detecting communities. The frequency of occurrence for each route can be used as the weight of the route whilst detecting the communities.
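The betweenness centrality described above, with the directed normalization by (N - 1)(N - 2), can be sketched using Brandes' algorithm, a standard method for this quantity. The source does not state which implementation was used, so the function and the toy network below are illustrative assumptions; links are unweighted and the adjacency lists are assumed to contain no repeated links.

```python
from collections import deque

def betweenness_centrality(adj, normalized=True):
    """Brandes' algorithm for an unweighted directed graph {node: [successors]}."""
    nodes = set(adj) | {v for succ in adj.values() for v in succ}
    bc = dict.fromkeys(nodes, 0.0)
    for s in nodes:
        # BFS from s: count shortest paths (sigma) and record predecessors.
        sigma = dict.fromkeys(nodes, 0)
        sigma[s] = 1
        dist = {s: 0}
        preds = {v: [] for v in nodes}
        order = []
        queue = deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in adj.get(u, ()):
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    sigma[v] += sigma[u]
                    preds[v].append(u)
        # Accumulate pair dependencies from the farthest nodes back to s.
        delta = dict.fromkeys(nodes, 0.0)
        for w in reversed(order):
            for u in preds[w]:
                delta[u] += sigma[u] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    n = len(nodes)
    if normalized and n > 2:
        bc = {v: b / ((n - 1) * (n - 2)) for v, b in bc.items()}
    return bc

# Hypothetical toy network in which H acts as a transshipment hub.
toy_adj = {"A": ["H"], "B": ["H"], "H": ["C", "D"], "C": ["A"], "D": []}
bc = betweenness_centrality(toy_adj)
print(sorted(bc.items(), key=lambda item: -item[1]))
```

In the toy network, six of the shortest paths between ordered pairs pass through H, so its normalized score is 6 / (4 x 3) = 0.5, while D lies on no shortest path between other ports and scores 0.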
Modularity: Modularity is a measurement for the partitioning of the network into communities. A higher modularity value indicates a stronger internal connection or cohesiveness within a community. In practice, a value between 0.3 and 0.7 is considered to be a good indicator of a significant cohesive community structure in networks [35-37]. Modularity can be expressed as:

$$Q = \frac{1}{2m}\sum_{i,j}\left[A_{ij} - \frac{k_i k_j}{2m}\right]\sigma(c_i, c_j)$$

where $A$ is the adjacency matrix for the network, $k_i$ is the total degree of node $i$, and $m = \frac{1}{2}\sum_i k_i$ is the total number of links. Specifically, if $i$ and $j$ are in the same community, then $\sigma(c_i, c_j) = 1$; otherwise, it is 0.
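The modularity of a given partition can be sketched directly from the standard expression Q = (1/2m) Σ_ij [A_ij - k_i k_j / 2m] σ(c_i, c_j), which is assumed here because it matches the definitions of A, k_i, and σ given above. The sketch treats links as undirected and weighted (the weight could be the route frequency); the toy graph, the partition, and the function name are hypothetical, and this is not the Leiden package's own routine.

```python
def modularity(edges, community):
    """Modularity Q of a partition for undirected weighted edges (u, v, weight)."""
    adjacency = {}  # symmetric A_ij
    degree = {}     # total (weighted) degree k_i
    for u, v, w in edges:
        adjacency[(u, v)] = adjacency.get((u, v), 0.0) + w
        adjacency[(v, u)] = adjacency.get((v, u), 0.0) + w
        degree[u] = degree.get(u, 0.0) + w
        degree[v] = degree.get(v, 0.0) + w
    two_m = sum(degree.values())  # equals 2m
    q = 0.0
    for i in degree:
        for j in degree:
            if community[i] == community[j]:  # sigma(c_i, c_j) = 1
                q += adjacency.get((i, j), 0.0) - degree[i] * degree[j] / two_m
    return q / two_m

# Hypothetical toy graph: two triangles joined by a single link c-d.
toy_edges = [
    ("a", "b", 1.0), ("b", "c", 1.0), ("a", "c", 1.0),
    ("d", "e", 1.0), ("e", "f", 1.0), ("d", "f", 1.0),
    ("c", "d", 1.0),
]
partition = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
q_value = modularity(toy_edges, partition)
print(round(q_value, 4))
```

Placing all six nodes in a single community gives Q = 0 with this expression, which is why a value well above zero indicates community structure beyond what the degrees alone would produce.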

Accessibility Evaluation Model
Combining the PLSCI, betweenness centrality for ports, and transit time between ports, the accessibility for port liner shipping transportation is defined as: where $A_{from\_i}$ and $A_{to\_i}$ are the accessibility from/to port $i$, respectively; $m$ and $n$ are the total number of routes from/to port $i$, respectively; $T_{ij}$ and $T_{ji}$ are the transit times from port $i$/$j$ to port $j$/$i$, respectively; $C_i$ and $C_j$ are the PLSCIs of port $i$/$j$, respectively; $B_i$ and $B_j$ are the betweenness centralities of port $i$/$j$, respectively; and $O_{ij}$ and $O_{ji}$ are the occurrence times of link $ij$/$ji$, respectively.

Results
Based on the collected data, we constructed a directed GLSN, as shown in Figure 2. The route attributes included the average transit time between nodes and the frequency of route appearances. The port attributes included the country that the port belonged to and the corresponding PLSCI of the port.

Figure 2. The GLSN. The size of a node is based on its degree; we set the size of nodes with a degree between 1 and 31 as the minimum and the size of nodes with a degree of 156 (Singapore) as the maximum. The color of the nodes in the diagram varies depending on the continent to which they belong. The continent classification is provided by IHS Markit.

Topological Characteristics of the GLSN
The average degree of the GLSN was 5.27, the average clustering coefficient was 0.33, and the average shortest path length was 4.10. The degree distribution of the GLSN, as shown in Figure 3, demonstrated that most ports had few shipping routes. However, there were several important ports such as Singapore port that had a considerable number of routes from/to different ports. A relatively high average degree and a small average shortest path length indicated that the GLSN conformed to the characteristics of a small-world network, which was consistent with the findings of previous works [13,38,39].
Previous studies [19,40-42] concluded that the GLSN should be a scale-free network and that the in-degree and out-degree of a directed network should each conform to a power-law distribution [43]. We tested the power-law fitting for the in-degree and out-degree distributions of the GLSN in log-scaled axes (Figure 4a,b); both R-square values were larger than 0.96. It can be seen from Figure 4c,d that the residuals of the power-law distribution fitting were higher at the small-degree nodes, but became smaller as the degree increased. The fitting result showed that the GLSN was a scale-free network for both the in-degree and out-degree. That is to say, a few nodes in the GLSN had a far greater number of links than the rest; these nodes are called hubs. Hubs typically had links from the entire network, serving as links between different parts of the network and thus giving rise to the small-world property. For example, the Singapore port, mentioned above, had the highest degree (in both the out-degree and in-degree); it is an important hub port for East Asian and European trade.
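A power-law fit on log-scaled axes with an R-square value can be sketched as an ordinary least-squares regression of log P(k) on log k. The source does not specify the fitting procedure, so this choice, the function name, and the synthetic degree sequence are assumptions for illustration rather than the authors' method.

```python
import math
from collections import Counter

def power_law_fit(degrees):
    """Least-squares fit of log10 P(k) = c - gamma * log10 k; returns (gamma, R^2)."""
    counts = Counter(d for d in degrees if d > 0)
    total = sum(counts.values())
    xs = [math.log10(k) for k in counts]
    ys = [math.log10(c / total) for c in counts.values()]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return -slope, 1.0 - ss_res / ss_tot

# Synthetic degree sequence whose frequencies follow k^-2 exactly.
toy_degrees = [1] * 64 + [2] * 16 + [4] * 4 + [8] * 1
gamma, r_squared = power_law_fit(toy_degrees)
print(round(gamma, 3), round(r_squared, 3))
```

The same function could be applied separately to the in-degree and out-degree sequences of a directed network. Least-squares fitting on binned log-log data is simple but can be biased in the tail, so maximum-likelihood estimators are a common alternative.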

Singapore and other hub ports are transit points for world maritime trade where goods are distributed. The in-degree and out-degree of most hub ports were similar, for example, Singapore (in-degree 78; out-degree 78), Busan (46; 44), and Rotterdam (43; 41). However, several ports had large differences between their in-degree and out-degree. As can be seen from Table 3, hub ports such as Tanger Med and Algeciras had an obvious difference between their in-degree and out-degree. The same situation occurred in other ports such as Sydney, Veracruz, and Tianjin.

The ranking for the betweenness of ports in the GLSN is partly shown in Table 4. Singapore port had the highest betweenness centrality, reaching 0.26 after normalization, followed by Rotterdam port at 0.13 and Busan port at 0.11.
The ports with a high betweenness centrality belonged to a wide range of countries, but they were mainly distributed in Asia and Europe. As seen in Figure 5, Asian and European ports had a higher average betweenness than the ports of the other continents. However, the differences in betweenness among continents were not highly significant.

The Leiden algorithm was used to determine the community division using three million randomly simulated divisions. Modularity was used to evaluate the result of each division. The partitioning based on the 3 million divisions divided the 557 ports into 10 communities, as shown in Figure 6. The largest modularity value observed was 0.6433.

Figure 6 shows the global spatial distribution of the ports belonging to each of the 10 communities, described in Table 5 as communities C1 to C10. The number of ports in the top 6 communities accounted for 82.45% of the total number of ports. C5 had the highest average degree at 6.00, the second highest average clustering coefficient at 0.44, and the third shortest average shortest path length at 2.54; it performed the best on the 3 indicators among the 6 largest communities.

Table 5. Description and indicators of the 10 communities determined from the GLSN analysis.

C1
The largest community was mainly distributed in Europe and north-west Africa and included 105 ports from 31 countries. The ports with the greatest betweenness centrality included Rotterdam, Algeciras, and Tanger Med. The average degree in this community was 4.24, the average clustering coefficient was 0.34, and the average shortest path length was 3.12.

C2
The second community was mainly located in south-east Asia, west Asia, and north-east Africa and included 88 ports from 33 countries. The ports with the greatest betweenness centrality included Singapore, Tanjung Pelepas, and Jebel Ali. The average degree in this community was 4.32, the average clustering coefficient was 0.44, and the average shortest path length was 2.83.

C3
The third community was mainly located on the east coast of North America and Central America and consisted of 87 ports from 39 countries. Major ports included Manzanillo (Panama), Cartagena (Colombia), and New York. The average degree in this community was 3.49, the average clustering coefficient was 0.31, and the average shortest path length was 3.44.

C4
The fourth community was scattered around the Mediterranean Sea and consisted of 80 ports from 24 countries. Major ports included Piraeus, Marsaxlokk, and Valencia. The average degree in this community was 4.94, the average clustering coefficient was 0.31, and the average shortest path length was 2.89.

C5
Ports of the fifth community were mainly distributed in east Asia and the west coast of North America and consisted of 65 ports from 8 countries. Major ports included Busan, Shanghai, Hong Kong, and Los Angeles. The average degree in this community was 6.00, the average clustering coefficient was 0.44, and the average shortest path length was 2.54.

C6
Ports of the sixth community were mainly scattered around the south coast of Africa and consisted of 40 ports from 18 countries. Major ports included Durban and Pointe Noire. The average degree in this community was 3.00, the average clustering coefficient was 0.33, and the average shortest path length was 3.68.

C7
The seventh community consisted of 30 ports from 12 countries in Oceania. Major ports included Auckland and Brisbane. The average degree in this community was 2.57, the average clustering coefficient was 0.30, and the average shortest path length was 3.73.

C8
The eighth community consisted of 27 ports from 11 countries on the west coast of South America and Central America. Major ports included Balboa and Callao. The average degree in this community was 4.37, the average clustering coefficient was 0.47, and the average shortest path length was 2.34.

C9
The ninth community consisted of 22 ports from 3 countries distributed on the east coast of South America. Major ports included Santos and Paranagua. The average degree in this community was 3.64, the average clustering coefficient was 0.36, and the average shortest path length was 2.30.

C10
The tenth community consisted of 13 ports from Norway. Major ports included Haugesund and Aalesund. The average degree in this community was 1.69, the average clustering coefficient was 0.29, and the average shortest path length was 2.88.

Accessibility of Ports in the GLSN
We obtained the PLSCI of global ports and matched the PLSCI in Q2 of 2021 (consistent with the period of obtaining the route data) with the ports in the GLSN for the accessibility calculations. Table 6 shows the top 30 ports with the highest PLSCI.

Almost all of the top 30 ports with the highest PLSCI were European and Asian ports, and 11 of the ports were in China (including Hong Kong, Macao, and Taiwan). As for the PLSCI of all ports, Figure 7a shows that the North American ports had the second highest average as well as the highest median PLSCI. The PLSCI of the Asian ports varied from 1.88 (Ajman port) to 145.85 (Shanghai port). Figure 7b shows that C5, which was mainly located in east Asia and the west coast of North America, had the highest average and median PLSCI.

The accessibility evaluation model was applied to calculate the inbound and outbound accessibility of ports in the GLSN; the top 30 ports are shown in Table 7. Singapore port had the highest inbound, outbound, and total accessibility amongst all ports in the GLSN, followed by Port Klang and Rotterdam. Tanjung Pelepas, Busan, and Shanghai port ranked fourth to sixth in total accessibility. In addition, the 30 ports all belonged to the first to fifth communities; the fifth community, which was mainly located in East Asia, had 11 ports in the ranking list.

Discussion
We constructed a directed GLSN and found that its in-degree and out-degree conformed to a power-law distribution, implying that a small number of hub nodes had a large number of links. Many studies consider the liner shipping network to be an undirected network, ignoring the directionality of the transportation flow. As shown in Table 3, Tanger Med port had a node degree of 72; its in-degree was 16 higher than its out-degree. Qingdao port had a node degree of 40, but its in-degree was 16 higher than its out-degree. These ports may be gateway ports that import goods into the hinterland. On the other hand, ports such as Algeciras and Le Havre, whose out-degrees were higher than their in-degrees, may be hub ports that export goods all over the world. In addition, ports with a higher inbound accessibility have advantages in transit distance (time), port location, and port attractiveness, but they are also at a higher risk of receiving invasive species.
According to the annual review of maritime transport by the UNCTAD, all containerized east-west trade routes among Asia, the Mediterranean, Europe, and North America account for 52.6% of the total freight volume of the world [44]. The community division result showed that the spatial distribution of the main communities conformed to the actual situation of the major routes of container transportation (the number of ports from communities 1 to 5 mainly distributed in Asia, the Mediterranean, Europe, and North America accounted for 59.93% of the total ports in the GLSN). In addition, container port throughput in Asia and Europe accounted for 79.71% of the global throughput, according to the UNCTAD [44]; the same trend emerged in the port accessibility assessment results. Most of the leading 30 ports with the highest accessibility were Asian (17) and European (10) ports.
Regarding port accessibility, although the average PLSCI of the North American ports ranked second (behind Asia), there was no North American port in the total accessibility ranking. The average betweenness centrality of the North American ports was relatively low and the transit times of both the trans-Pacific routes and the North America-Europe routes were longer than those of the other major routes. This could be the reason why the overall accessibility of the North American ports was not high.
Regarding port management, an assessment of port accessibility can clarify the state of a port. Hub ports such as Singapore and Hong Kong should maintain their inbound and outbound accessibility at a similar level. Ports with a higher inbound or outbound accessibility such as Hamburg and Nansha should further develop their strengths and enhance their connectivity with the shipping network. For shipping companies, the accessibility of ports could also be useful when designing new routes. For example, a similar level of outbound and inbound accessibility for all ports in a route may reduce blank sailings and improve the efficiency of ships.
Although the data obtained were based on only six main liner companies, they were enough to illustrate the GLSN; the research result could easily be extended to a more detailed dataset. The transit time of the routes was the average value collected from the websites of the liner companies rather than the actual transit time, which may have had a slight impact on the calculation of accessibility. In addition, the data did not contain information such as the ship tonnage or ship carrying capacity; therefore, the trade volume for routes was not considered in this study. Despite these limitations, the GLSN developed to quantitatively analyze the inbound and outbound accessibility of global container ports could be used for subsequent studies.

Conclusions
The GLSN, constructed in Space-L from the routes of liner shipping companies, was a scale-free network, which indicated that a few ports accommodated the majority of links. The community division into 10 clusters showed an obvious correspondence with the actual trade flow. The directionality of the inbound and outbound trade flows significantly affected the topological structure. The accessibility evaluation result showed that the Asian ports had the highest total accessibility, with the inbound accessibility close to that of the outbound. The European ports ranked behind the Asian ports. The ports in North America had a relatively low accessibility because of the long transit time and low betweenness. Our research has enhanced the understanding of maritime networks and could provide insights into route optimization as well as other studies such as species invasion and port planning.
In the future, the research in this paper can be expanded in several ways. First, due to the availability of data, our analysis focused on the topological characteristics of the GLSN. However, other indicators such as port throughput and port efficiency are worth studying. Second, the liner shipping data collected in 2021 reflected the shipping patterns in the post-COVID-19 era. However, the outbreak of war between Russia and Ukraine in 2022 has led to further changes in the patterns of global energy and food trade. It is possible to construct an updated long-term shipping network to analyze the impact of major international incidents such as COVID-19 or regional wars on maritime transport.