Reducing Risks by Transporting Dangerous Cargo in Drones



Introduction
The ground transport of dangerous goods or hazardous materials (hazmat) multiplies the risk of exposing populations and the environment to spills from handling or transportation mishaps [1]. Therefore, this research identified potential opportunities for future autonomous cargo drone fleets to carry hazmat. In addition to the safety benefits of eliminating the human operator and removing the vehicle from the ground environment, autonomous aircraft service will be faster and more efficient by avoiding ground traffic congestion and operating continuously. External benefits to the environment would be the reduction in road wear from heavy vehicle traffic and the reduction in road congestion, which would also reduce greenhouse gas emissions [2]. Initial successful deployments will encourage the public and regulators to speed up adoption by enacting new policies and standards that will scale commercial cargo drone services worldwide. Therefore, the goal of this research was to identify a minimal set of metropolitan areas where early cargo drone deployments could demonstrate the greatest initial benefits.
Cargo drone developments show steadily increasing levels of automation, heavy-lift performance [3], cost reduction [4], and enhanced safety with distributed electric propulsion [5] and integrated parachute systems [6]. Fleets of cargo drones can fly continuously, autonomously, and quietly [7] to meet the freight throughput demands. Hence, shippers in the commodity supply chain are now evaluating the potential for drones to complement or even replace ground transport modes for some types of commodities [8].
Despite many barriers to their adoption today [9], innovators and manufacturers are on track to automate drones so that they can fly safely without human intervention in beyond visual line of sight (BVLOS) conditions [10], and in all types of weather [11]. Meanwhile, manufacturers and governments are collaborating in an initiative known as Advanced Air Mobility (AAM) to overcome the technical and regulatory hurdles of safely integrating drones into the national airspace [12].
A contribution of this work is a new hybrid data mining (HDM) workflow that combined unsupervised machine learning (UML) and geospatial processing (GP). The UML part of the workflow analyzed the raw data of commodity flows to identify a minimal set of metropolitan areas with the greatest demand for transporting dangerous goods. The GP part of the workflow used geographical information system (GIS) techniques and spatial data to determine the geodesic distances between and within all pairwise combinations of metropolitan areas in the continental United States (CONUS). The technique revealed the potential demand for moving hazmat within various distance bands. The author could not find a similar hybrid analytical workflow within the literature.
The organization of the rest of this paper is as follows: Section 2 reviews the literature on AAM, the classification of dangerous goods, and its transport hazards. Section 3 describes the hybrid data mining (HDM) workflow developed to address the goal of this research. Section 4 discusses the analytical results and implications for stakeholders such as supply chain managers, shippers, urban planners, and policy makers. Section 5 concludes the research and suggests future work.

Literature Review
The three subsections of this literature review explore the status of AAM, define dangerous goods, and discuss related research on the risks of transporting dangerous goods.

Advanced Air Mobility
For certain high-value or urgently needed items, shippers generally exchange the lower cost of ground transportation for the higher speed and security of air transport [13]. The electrification and automation of drones can further reduce the operating costs and safety risks of air transport [14]. Cost savings due to time reduction is one of the main factors driving the drone package delivery initiative by Amazon [15]. Drones can also transport cargo across regions with poor roads, inhospitable terrain, jungles, waterways, or lakes [12]. Hence, cargo drones have recently been used in niche applications such as humanitarian logistics [16], emergency response [17], and the delivery of urgent items such as medical supplies [18] and replacement parts [19].
There has been a drastic increase in the number of articles about using drones for transportation [20]. Many large retailers and shippers have already been using drones for "last-mile" deliveries. For example, in 2022, Walmart announced that it is expanding delivery services from stores to homes and has been achieving average delivery times of 30 min [21]. Similarly, Amazon, DHL, and Federal Express are evaluating the use of drones to expand their next-day or same-day package delivery services [22]. Manufacturers are also developing heavy-lift cargo drones to address "middle-mile" opportunities [23]. In contrast to last-mile, middle-mile encompasses transport service between ports, transshipment facilities, distribution centers, sortation centers, fulfillment centers, warehouses, and stores [24].
In 2021, the U.S. National Aeronautics and Space Administration (NASA) released a concept of operations for Urban Air Mobility (UAM) to promote their vision of using drones to fly passengers and cargo over dense population centers [25]. The Federal Aviation Administration (FAA) released an accompanying concept of operations that expanded the UAM vision as Advanced Air Mobility (AAM) to include other use cases such as public services and recreational use [10]. Although there are still many unresolved risks for the safe integration of drones into the national airspace system [26], there have been many proposals [27] and safety demonstrations [28]. In anticipation of inevitable regulatory certifications, supporting infrastructure, and demand, hundreds of drone manufacturers have attracted billions of dollars in investments to commercialize their aircraft [11,29]. Analysts predict that the global cargo market for drones will reach $58 billion by 2035 [30,31].
FAA regulations will not allow commercial drone operations until service providers can demonstrate their safe operation in the national airspace [10]. Therefore, once drones are certified, shippers will be confident that pilotless drones are safer than surface transportation modes because of the reduced conflict with traffic in the vast three-dimensional airspace, the elimination of operator error, and advancements in air traffic management [12]. Subsequently, shippers can encourage a policy shift towards cargo drones by demonstrating safe operations in a few focused regions that maximize their safety, environmental, and economic impacts [32].

Defining Dangerous Goods
The USDOT Pipeline and Hazardous Materials Safety Administration (PHMSA) defines nine classes of dangerous goods [33,34]. Table 1 summarizes the classes of hazmat, some examples, their typical uses, and risks.

Transporting Dangerous Goods
The accidental release of hazmat during transport can adversely affect human health and damage the environment [38]. The Code of Federal Regulations (Title 49, Subtitle B, Chapter I, Subchapter C) regulates the transport of hazmat in the United States [39]. Despite such detailed and strict regulations, the United States experienced more than 189,000 accidents (non-pipeline from 2012 to 2021) involving hazmat that resulted in 1732 injuries, 83 fatalities, and nearly $827 million in property damages [33]. Most of the hazmat incidents occurred on roadways [40]. An external cost from accidents that involve hazmat is the loss of traffic capacity (economic productivity) from closed links during the incident investigation, environmental cleanup, and facility repairs.
The proximity of roadways and railways to vulnerable facilities such as gas stations, hospitals, large buildings, and schools can multiply the potential consequences of accidents involving hazmat [41]. Hazmat spills can contaminate water sources [42] and harm wildlife [43]. Nearby communities can become contaminated in cases involving the release of chemical, biological, radiological, or nuclear substances [44]. A recent study found that exposure to toxic chemicals can adversely affect learning in early childhood [45]. Another recent study found that exposure to pollution increases the risk of more severe infections and deaths from viruses such as COVID-19 [46].
Human factors have been the dominant cause of both railroad [47] and truck accidents [48]. Wei et al. (2021) found that weather, traffic signals, surface conditions, fatigue, and the time of day have strong associations with hazmat road accidents [49]. Future autonomous aircraft offer the potential to reduce the risk of hazmat transport by eliminating the vehicle operator and distancing the vehicle from populated places and the ground environment. Yet, a recent review paper found that there have been extraordinarily few studies of the potential to transport dangerous goods by aircraft [50]. The literature mainly covered risk assessment [51], traffic flow prediction [52], vehicle routing [53], tour length optimization [54], facility locating, scheduling, aircraft loading [55], emergency response [56], and accident cause analysis [40].
A recent survey found that there are currently no regulations that explicitly address the transport of dangerous goods by drones [57]. A few studies examined the risks of carrying certain types of hazmat by air. Hafeez et al. (2022) reviewed the use of drones to apply pesticides [58]. Lukežič et al. (2010) conducted a series of experiments on hexamethylenetetramine and ignition briquettes, which are chemically similar substances that had the same recommended packing group class for air cargo [59]. They found that exposure to elevated temperatures can cause spontaneous combustion in ignition briquettes and, therefore, policies should increase their packaging group class. Flammable liquids will require special containers for both ground and air transport [60]. Containers that eliminate the possibility of an uncontrolled fire can prevent the explosion of substances such as ammonium nitrate fertilizer [61].
The need for rapid response in emergency medicine has led to a proliferation of drone usage to carry certain life-saving items [62]. Studies have shown that active cooling boxes attached to drones can maintain the integrity of biological samples [63], human organs [64], and adrenaline auto-injectors used to treat anaphylaxis [65]. Zipline in Rwanda has grown into a service that uses drones to deliver blood and related medical products to remote health facilities [18].
In summary, there is no other work in the literature that considered opportunities to leverage the future capabilities of AAM to transport dangerous cargo. The literature does not describe a similar data mining workflow that combined methods of transforming origin-destination commodity flow data to reveal insights about outlier locations that currently move the largest amounts of dangerous cargo. The literature does not discuss how to select distance bands that would address the maximum potential demand in the fewest locations that future cargo drones could serve.

Methodology
Figure 1 illustrates the HDM workflow with the procedures, as coded in software, and their interactions. The data layer of the workflow used the Freight Analysis Framework (FAF) dataset because it is the most comprehensive source of multimodal commodity flows available for locations within the United States [66]. The FAF combines shipping data from the Bureau of Transportation Statistics (BTS) of the U.S. Department of Transportation (USDOT) and the U.S. Census Bureau (USCB) [67]. Each row of version 5.2 of the FAF dataset lists a commodity category moved between pre-defined zones (FAF zone) by transport mode category (air, rail, truck, etc.), weight in thousand-ton units, and value in million U.S. dollar (USD) units. The generalized design of the workflow enables analysts to apply it to any available dataset of commodity flows elsewhere in the world.
The following subsections describe how the HDM workflow summarized the weight of commodity categories moved in each FAF zone of the CONUS, determined a suitable category of dangerous goods for the case study, identified clusters of demand for multiple ground transportation modes, and created a histogram of the weight moved within a series of distance bands.

Pivot Table by Weight
The "Extract" procedure partitioned the data into three modal subsets of commodity flows: truck, rail, and air. The mode category code served as a key for the software logic to filter and separate the rows of data accordingly. The procedure "Pivot Table by Weight" converted raw origin-destination movement data into tables that summarize the weight by origin and destination in every FAF zone for each of the commodity categories in a modal data subset. The origin and destination pivot tables listed 130 FAF zones along the rows and 43 commodity categories across the columns. Movements by air had the same number of FAF zones as for trucks and rail but two fewer (41) commodity categories. Combining the origin and destination pivot tables produced a summary of the weight moved by commodity category in each FAF zone. In addition to summarizing the data, the pivot table procedure helped to detect errors such as missing values, incorrectly coded features, and any outlier values that were incorrect.
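The extract and pivot steps described above can be sketched in a few lines of standard-library Python. This is a minimal illustration and not the author's implementation: the row layout, the sample zones, and the weights are hypothetical, and the sketch assumes that "combining" the two pivot tables means adding them, which would count a flow that starts and ends in the same zone twice.

```python
from collections import defaultdict

# Hypothetical rows: (origin_zone, dest_zone, commodity, mode, weight_kton)
flows = [
    ("Houston", "Dallas", "basic chemicals", "truck", 120.0),
    ("Houston", "Houston", "basic chemicals", "rail", 80.0),
    ("Dallas", "Houston", "fuels", "truck", 40.0),
    ("Chicago", "Dallas", "basic chemicals", "truck", 10.0),
    ("Chicago", "Houston", "basic chemicals", "air", 0.5),
]

def extract(rows, mode):
    """Keep only the rows for one transport mode (the mode is the filter key)."""
    return [r for r in rows if r[3] == mode]

def pivot_by_weight(rows, zone_index):
    """Sum weight per (zone, commodity); zone_index 0 = origin, 1 = destination."""
    table = defaultdict(lambda: defaultdict(float))
    for row in rows:
        table[row[zone_index]][row[2]] += row[4]
    return table

def combine(origin_table, dest_table):
    """Add the origin and destination tables (assumed combination rule)."""
    total = defaultdict(lambda: defaultdict(float))
    for table in (origin_table, dest_table):
        for zone, by_commodity in table.items():
            for commodity, weight in by_commodity.items():
                total[zone][commodity] += weight
    return total

truck = extract(flows, "truck")
summary = combine(pivot_by_weight(truck, 0), pivot_by_weight(truck, 1))
```

Each entry of `summary` corresponds to one cell of the zone-by-commodity table, so a missing or implausible cell is easy to spot during the error checks mentioned above.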

Dangerous Goods Category
An analysis of the FAF 5 metadata description revealed that many of the items in the commodity categories of crude oil, fuels, pharmaceuticals, and basic chemicals are within the dangerous goods classes defined in Table 1 above. This analysis considered the commodity category of basic chemical materials (BCMs) for the U.S. case study because it contains most of the dangerous goods transported without pipelines. BCMs defined in the FAF are further subclassified as organic and inorganic chemicals. Examples of organic chemicals include ethyl benzene, paraformaldehyde, acyclic alcohols, peroxides, hydrocarbons, hormones, and toners. Examples of inorganic chemicals include hydrochloric acid, sulfuric acid, hydrogen, rare gases, alkali metals, mercury, and radioactive elements.

Regional Demand Cluster
The FAF zones include metropolitan statistical areas (MSAs) and areas outside of MSAs, which were the remainder of states or entire states. The analysis focused on MSAs for initial deployments because any localized disruptions in their large population centers and trade gateways will amplify the risk of harm to society and increase the risk of supply chain disruptions. A procedure flagged FAF zones as MSAs where the metadata description contained the words "area" or "part" so that the "extract" procedure produced a list of only the MSA movements by weight. The procedure identified 83 of the 130 FAF zones as MSAs in the CONUS.
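The MSA flagging rule can be illustrated with a short sketch. The zone identifiers and description strings below are hypothetical stand-ins for the FAF metadata, and matching on whole words (rather than substrings) is an assumption made here to avoid accidental matches inside longer words.

```python
# Hypothetical zone identifiers and descriptions, loosely styled after metadata.
zone_descriptions = {
    "481": "Houston-The Woodlands, TX CFS Area",
    "489": "Remainder of Texas",
    "171": "Chicago-Naperville, IL-IN-WI CFS Area (IL Part)",
    "300": "Montana",
}

def is_msa(description):
    """Flag a zone as an MSA if its description has the word 'area' or 'part'."""
    words = description.lower().replace("(", " ").replace(")", " ").split()
    return "area" in words or "part" in words

# Zones that the extract step would keep for the MSA-only analysis.
msa_zones = sorted(z for z, d in zone_descriptions.items() if is_msa(d))
```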
The next objective was to identify a minimal set of MSAs that moved the largest amount of BCMs. It is not possible for humans to visually identify clusters or outliers when a data element has more than three features. Hence, the strategy to generalize the workflow was to apply unsupervised machine learning (UML) to detect MSA clusters and outliers based on the weight of BCMs moved by all transport modes.
The HDM workflow used three of the most popular UML methods, all of which are clustering algorithms: density-based spatial clustering of applications with noise (DBSCAN) [68], Louvain [69], and k-means clustering [70]. Table 2 summarizes their basic theory of operation, their advantages (A), and their disadvantages (D). Hyperparameters are algorithm settings that the user must provide, tuned by iteratively observing intermediate results across a range of values. The table shows the values of those hyperparameters that provided the best performance for this dataset. The next step in the workflow extracted the demand outliers to calculate the distance band distribution.

DBSCAN
Density-based spatial clustering of applications with noise (DBSCAN) separates densely packed points from outliers. It first identifies core points as those that are within distance d of at least k points. It then seeds a cluster by randomly selecting an unassigned core point and grows that cluster by sequentially adding other core points that are within distance d, repeating until all core points are assigned to a cluster. Finally, it assigns each non-core point to a cluster that is within distance d; non-core points that are not within distance d of any core point remain outliers.
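The DBSCAN procedure can be sketched directly from that description. The following is a minimal, unoptimized version for illustration only; it is not the implementation used in the study. The parameter names `eps` and `min_pts` stand for the distance d and the count k, the neighbor count includes the point itself (a convention chosen for this sketch), and the sample points and hyperparameter values are hypothetical.

```python
import math

def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters and -1 for outliers."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1  # points[i] is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: join, but do not expand
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also a core point: keep growing
    return labels

# Hypothetical (truck, rail) weights: one dense group and three distant points.
points = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (1.1, 1.2), (0.8, 0.8),
          (6.0, 5.0), (9.0, 2.0), (20.0, 18.0)]
labels = dbscan(points, eps=0.6, min_pts=3)
```

In this toy input, the five nearby points form one cluster and the three distant points are labeled -1, which mirrors how the workflow treats high-demand locations as outliers relative to the dense cluster.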

Distance Band Creation
The GP part of the workflow determined the inter-MSA geodesic distances and estimated intra-MSA distances. Establishing cargo drone hubs near the center of MSAs would minimize the connecting first-mile and last-mile distances within the MSA. The tactic to determine the inter-MSA distances was to utilize a dataset containing the areas of all U.S. counties, combine the areas of counties that are part of an MSA, find the centroid of each MSA, and then compute a distance matrix that accounted for all pairwise combinations of the centroids. The TIGER® shapefile maintained by the USCB contains both the land and water areas of all U.S. counties [71]. The 2017 commodity flow survey geographies database, also maintained by the USCB, contained a FAF zone identifier for every county in the United States [72]. The merge procedure is a data manipulation technique that adds more variables to a record by combining data from two or more records that may be sourced from different locations [70]. However, a successful data merge requires that each record have a common variable called the key. In this workflow, the county code (FIPS5) served as a key that added the FAF zone identifier to the TIGER® shapefile. That enabled a GIS dissolve procedure to define contiguous MSAs and to aggregate the areas of their counties. Next, a GIS centroid procedure calculated the center of each MSA as shown in Figure 2, which then enabled a GIS distance matrix procedure to calculate the geodesic distances among the MSA centroids.
Unfolding the distance matrix and merging it with the commodity flow data added a geodesic distance for flows between MSAs. However, the distance matrix cannot produce a value for intra-MSA flows. Therefore, the tactic was to estimate an average intra-MSA distance in direct proportion to the MSA size: half the diagonal of a square with an area equal to that of the MSA. Figure 2 shows that the GIS dissolve procedure produced 83 MSAs and 46 remaining FAF zones, along with their geospatial centroids. Finally, a histogram procedure binned the weight of BCM moved into 100-mile distance bands from 100 to 2800 miles.
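The distance calculations can be sketched as follows, under stated assumptions. The haversine formula on a spherical Earth stands in for the geodesic distance that a GIS package would compute on an ellipsoid, the coordinates are approximate city locations used as hypothetical centroids, and labeling each band by its upper edge is a choice made for this sketch.

```python
import math
from collections import defaultdict

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; spherical approximation

def great_circle_miles(a, b):
    """Haversine distance in miles between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))

def intra_msa_miles(area_sq_miles):
    """Half the diagonal of a square whose area equals the MSA area."""
    side = math.sqrt(area_sq_miles)
    return side * math.sqrt(2) / 2

def distance_band(miles, width=100):
    """Upper edge of the band that contains `miles` (e.g. 225 -> 300)."""
    return max(width, math.ceil(miles / width) * width)

def band_histogram(flows):
    """Sum weight per distance band; flows are (miles, weight) pairs."""
    hist = defaultdict(float)
    for miles, weight in flows:
        hist[distance_band(miles)] += weight
    return dict(sorted(hist.items()))

# Hypothetical centroids (lat, lon) and a hypothetical 10,000 sq-mile MSA.
houston, dallas = (29.76, -95.37), (32.78, -96.80)
inter = great_circle_miles(houston, dallas)
intra = intra_msa_miles(10_000)
hist = band_histogram([(inter, 120.0), (intra, 80.0), (inter, 10.0)])
```

For a square of area A, half the diagonal equals the square root of A/2, so the intra-MSA estimate grows with the square root of the MSA area.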

Results and Discussions
The following subsections discuss the results of the regional demand clustering and the distance band distribution of BCM weight moved. Figure 3 shows the results from the three clustering algorithms, plotted using only the top two modes for ease of visualization. The vertical and horizontal positions of each point (circle) in the figure represent the total weight of BCMs moved in 2017 by truck and rail, respectively. Each point on the scatter plot represents an MSA that moved the indicated weight by the two modes. The legend label of each colored point indicates the cluster to which the algorithm assigned it. Table 3 lists the tuned hyperparameter settings for each clustering algorithm.

Regional Demand Cluster
Visually, the scatter plot shows one extreme outlier in the top right (Houston, TX), one dense cluster in the lower left, and a set of outliers towards the middle. The outliers are distinctly separated from the high-density cluster in the lower left. DBSCAN correctly identified the outliers. Louvain identified nine clusters, with its C5 cluster mostly agreeing with DBSCAN. k-means treated the extreme outlier as its own cluster but included a few edge points of the high-density cluster into the outlier group as a third cluster.
Visual validation suggested that DBSCAN provided the best performance for this application by identifying all the outlier MSAs. The two-dimensional visualization was effective because the weight of BCMs moved by air was substantially less than the weight moved by trucks and rail. Figure 4 shows the proportional distribution of the total BCM weight moved by the three modes of truck, rail, and air in each MSA. The figure shows the MSAs by total weight moved, sorted with the highest at the bottom. As labeled in the figure, only 9 of the 83 MSAs in the CONUS moved nearly half the weight (49.3%) of the BCMs. The long distribution tail of Figure 4 further illustrates that beyond those initial 9 MSAs, there are diminishing returns on the weight of the BCMs that cargo drone deployments can move in each additional location.
Nine of the MSAs among the DBSCAN outliers are in only four states: California (Los Angeles, San Francisco), Texas (Houston, Dallas-Fort Worth, Beaumont), Louisiana (Baton Rouge, New Orleans, Lake Charles-Jennings), and Illinois (Chicago). This result is fortuitous because focusing deployment plans within only a few states will ease the burden of working across many jurisdictions that may initially have different constraints, policies, and regulations for commercial cargo drone services. Focused deployments in only a few states can also help to minimize the risk of supply chain disruptions from localized weather events such as hurricanes, tornados, and extreme cold that can damage utilities and block surface transportation routes.

Distance Band Distribution
To put the weight moved into perspective, a typical North American semi-trailer truck (18-wheeler/big rig) carries 45,000 pounds (22.5 tons) of cargo [73]. Table 4 lists the distance band distribution of truckload equivalents and their proportion for the selected outlier MSAs, and Figure 5 plots the data for visualization. Current projections based on the anticipated improvements in battery technology suggest that cargo drones will exceed 400-mile ranges well before 2050 [74]. As shown in Table 4, air, rail, and trucks accumulated (acc.) to 48.9%, 53.3%, and 84.7% of all BCM movements, respectively, within 400 miles. Figure 5 reveals that for each transport mode, there was a distinct point of diminishing returns in the accumulated proportion after 400 miles. Therefore, cargo drones with a robust 400-mile range have the potential to remove nearly 85% of the trucks that carry BCMs across the nation's roadways. Based on growth estimates from the National Freight Strategic Plan, truck transport will increase by 35% by 2040 [75], which means that the potential benefits will be even greater over time.
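The truckload conversion and the accumulated proportions in a distance band table can be reproduced with a short calculation. The sketch below uses hypothetical band weights rather than the values of Table 4, and assumes that weights are in thousand-ton units as in the FAF dataset.

```python
from itertools import accumulate

TONS_PER_TRUCKLOAD = 22.5  # 45,000 lb, as stated above

def truckload_equivalents(weight_kton):
    """Convert a weight in thousand tons to equivalent truckloads."""
    return weight_kton * 1000 / TONS_PER_TRUCKLOAD

def accumulated_share(weight_by_band):
    """Map each band to the cumulative fraction of weight moved within it."""
    bands = sorted(weight_by_band)
    total = sum(weight_by_band.values())
    running = accumulate(weight_by_band[b] for b in bands)
    return {b: r / total for b, r in zip(bands, running)}

# Hypothetical weights (thousand tons) keyed by 100-mile band upper edge.
weight_by_band = {100: 450.0, 200: 225.0, 300: 90.0, 400: 45.0, 500: 90.0}
shares = accumulated_share(weight_by_band)
```

Reading `shares` at the 400-mile key gives the fraction of weight that a drone with a 400-mile range could address, which is the quantity that the accumulated columns report for each mode.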
As noted above, only nine MSAs in four U.S. states accounted for 49.3% of the BCM weight moved, and 76.3% of that weight moved within a 400-mile range. Therefore, deploying cargo drones in the MSAs of only four states has the potential to carry 49.3% × 76.3% = 37.6% of all the BCM weight moved in the CONUS. The equivalent in truckloads displaced in 2017 would have been 4.7 million. The above results demonstrate the practical value of the workflow in providing insights for decision making under high uncertainty. The data mining workflow transformed a large origin-destination database into a tabular summary that enabled further merging with GIS data to add geodesic distance information estimated from the spatial geometry of maps. Logistical managers and investors can adopt the workflow without concerns about using non-standard or unproven functions because it incorporated mature data science methods like pivoting, extraction, merging, histogram, clustering, GIS dissolve, and GIS centroid.
A limitation of this study is that it focused on identifying areas with significant amounts of hazmat movements that would be worthwhile locations for drone deployments but did not examine possible engineering limitations. For example, new policies would need to consider the design and availability of special containers of suitable weight and material for the proposed cargo drone applications. Engineers will need to address how the transport system will attach and release those containers, and how to design redundant systems that will enable safe landing in an emergency. Policymakers will also need to determine the population density threshold and minimum altitude for routes that need to cross populated places.

Conclusions
Autonomous cargo drones are emerging as a new mode of air transportation that has the potential to displace the surface transportation of many types of commodities. The removal of human operators not only reduces cost but also increases safety by eliminating the potential for human error or harm. Based on the number of global incidents in recent years, air transportation has been far safer than surface transportation modes such as truck and rail. Federal regulations will ensure that all commercial cargo drones can safely integrate into the national airspace. Autonomous cargo drones can fly rapidly, in all types of weather conditions, as fleet swarms, and at all hours to support the demand for freight capacity. Whereas surface transportation modes are subject to weather events and accidents that can cause congestion and route closures, cargo drones can fly unimpeded and more directly between terminals. Electrified vertical takeoff and landing (eVTOL)-type cargo drones do not need airports because they can use vertiports atop buildings and in small open areas in metropolitan areas. Battery-powered drones will eliminate harmful emissions if they can charge from clean energy sources.
The future realization of advanced air mobility (AAM) presents an opportunity to move dangerous goods by air and thus minimize the risk of harm to people and the environment. This applied research developed a new hybrid data mining and machine learning workflow to identify the fewest initial locations where cargo drone deployments can yield the greatest benefits. Logistical managers and investors can adopt the workflow to analyze combined commodity flow and GIS datasets to produce insights for decision making under high uncertainty. A case study of the workflow on data from the United States found that deployments in only nine metropolitan areas in four states can move 38% of all basic chemicals within 400 miles. The implication is that initial success will demonstrate their safety and efficiency benefits to guide policy making and new logistical standards for transporting dangerous goods. Future work will utilize the generalized data mining workflow to study the potential markets for moving other types of commodities such as pharmaceuticals and perishable items that are vulnerable to supply chain disruptions.

Data Availability Statement: Some or all data, models, or code used during the study were provided by a third party. Direct requests for these materials may be made to the provider cited in the manuscript.

Conflicts of Interest:
The author has no competing interests to declare that are relevant to the content of this article.