A Framework of an Integrated Livestock Vehicle Trajectory Database Using Digital Tachograph Data

Abstract: The outbreak of African swine fever virus has raised global concerns regarding epidemic livestock diseases, and various studies have therefore attempted to prevent and monitor them. Most of these studies emphasize that integrated research spanning public health and transportation engineering is essential to prevent the spread of livestock disease. However, it has been difficult to obtain big data on the mobility of livestock-related vehicles, which makes it challenging to conduct research that jointly considers the movement of cargo vehicles carrying livestock and the spread of livestock infectious diseases. This study developed a framework for integrating digital tachograph (DTG) data with data on trucks' visit history at livestock facilities. The DTG data include commercial trucks' coordinate information but lack the information needed to identify actual livestock-related vehicle trajectories, such as freight types and facility visit history. The integrated database we developed can therefore be used as a significant resource for preventing the spread of livestock epidemics by pre-monitoring the movements of livestock transport vehicles. In future studies, epidemiological research on infectious diseases and livestock species can be conducted using the derived integrated database. Furthermore, indicators of the spread of infectious diseases could be proposed based on both microscopic and macroscopic roadway networks to manage livestock epidemics.


Introduction
The epidemic of African swine fever virus has led to increasing global interest in epidemic livestock diseases. The World Organization for Animal Health has focused on this issue and designated research each year [1][2][3]. Furthermore, many other studies have proposed methods to prevent and control epidemic livestock diseases [4][5][6]. As noted in many previous studies, the spread of livestock epidemics is inseparable from road mobility [7][8][9]. Thus, convergence research in health sciences and transportation engineering is required to control infectious diseases. Accordingly, countries seek ways to prevent the spread of infectious diseases from a transportation engineering perspective through systems that store and analyze information on the traffic trajectories of livestock-related vehicles [10][11][12]. However, in South Korea it is impossible to obtain a clear trajectory for livestock-related vehicles, as the country manages only the entry and exit records of vehicles at livestock facilities. Moreover, there have been no cases of convergence research between health science and transportation engineering aimed at effectively controlling the spread of infectious diseases, nor are there experts competent in both fields.
South Korea has significant experience with the spread of livestock epidemics. The damage involved accounted for 76% of total social disaster damage through 2018 [13].
In addition, when foot-and-mouth disease broke out in 2011, approximately 87.2% of the propagation factors were found to be livestock vehicles, leading some studies to identify vehicle movement as the largest factor in transmission. Although South Korea has developed various transportation infrastructures to serve its high population density, infectious diseases spread easily owing to its small land area. Thus, from a transportation perspective, additional studies based on analyzing the mobility of livestock-related vehicles are needed to prevent the spread of livestock epidemics. Currently, the South Korean government collects the mobility data of all commercial cargo vehicles (including livestock-related vehicles) and generates a real-time location database (DB) in the form of big data through the mandatory installation of digital tachograph (DTG) devices. However, it is impossible to extract only livestock-related vehicles from the DTG data, as this DB collects only traffic trajectories and driving behaviors and does not record the type of cargo loaded on the vehicle. Therefore, this study attempted to estimate the actual trajectories of livestock transport vehicles on roadways by matching the DTG data with actual visit history data from livestock facilities. From these trajectories, we derived roadway segments and areas with a high risk of spreading infectious diseases. The livestock-related vehicle data matched in this study can be used as primary data for research on the spread of livestock diseases. Moreover, they can be used to derive infectious disease propagation patterns for each livestock breed in the future. Furthermore, they can provide critical basic data to government managers for establishing pre-monitoring and post-monitoring strategies.
To prevent the spread of livestock epidemics, South Korea has enacted a policy of applying the same quarantine procedures to all livestock within a specific range of any affected farm (Livestock Infectious Disease Prevention Act). For example, according to the Manual for foot-and-mouth disease prevention, the Korean government classifies zones as controlled, protected, or forecast, depending on their distance from the farms where the epidemic occurred. In the event of an epidemic, all livestock in the controlled and protected zones are slaughtered regardless of infection, whereas farms in the forecast zone are required to restrict the movement of livestock products and ban entry. Owing to this collective culling, farm owners must also kill uninfected livestock [14], resulting in a total loss of 2 trillion won [15]. In addition, there are other adverse effects on the local economy, owing to the psychological distress of farm owners and the restricted movement of vehicles [16]. As such, collective control and disposal measures are inefficient and cause significant damage [13]. Hence, strict standards and research are required to establish an improved epidemiological basis for analyzing and preventing the spread of livestock epidemics.
Studies that have simulated the propagation behavior of livestock epidemics have identified livestock-related vehicles as a significant cause of epidemic propagation [7,9]. Related work shows that the main factor in the spread of livestock epidemics is the travel of people and vehicles [8]. Therefore, to prevent the spread of livestock epidemics, a solution must be found that takes the transportation sector into account. Choi et al. [17] derived the propagation speed of foot-and-mouth disease by analyzing a vehicle movement network, and Miranda et al. [18] analyzed the transportation routes of livestock vehicles to show the various advantages of specific routes beyond preventing the spread of infectious diseases, e.g., preserving the condition of livestock products or reducing transportation costs.
In the past, research on the spread of livestock diseases has been weak at the micro-spatial level of road units, as detailed DBs on individual livestock-related vehicles were not available. Recently, however, big public data have become available, and DBs containing each vehicle's movement information have become easy to collect because national or local governments require GPS devices to be installed on livestock vehicles. Since 2012, the Korean Ministry of Agriculture, Food and Rural Affairs (MAFRA) has recommended that GPS data be collected for each livestock-related vehicle to analyze vehicles' movement behaviors and establish appropriate quarantine policies [19]. The US Animal and Plant Health Inspection Service has also established a system for monitoring movements between objects through the National Animal Health Monitoring System, a monitoring system for livestock diseases [11]. In the UK, the Rapid Analysis and Detection of Animal-Related Risks (RADAR) system was established [12,20]. In Denmark, the Central Husbandry Register (CHR) system aggregates breed and GPS data [10,21]. As such, some countries encourage the establishment of vehicle monitoring systems to prevent livestock epidemics and collect relevant databases. However, many countries have not established monitoring and database systems for analyzing the movement data of livestock-related vehicles.
In South Korea, the government aggregates facility visit history data for livestock-related vehicles but does not compile detailed information on the roads that each vehicle uses when moving. The Korean government also provides DTG data that aggregate commercial cargo vehicles' location coordinates at 1-s intervals. However, these data do not distinguish between the vehicles' load items, making it difficult to extract only the trajectories of livestock-related vehicles. In other words, the visit history data contain no vehicle coordinates, and the DTG data contain no information on which cargo vehicles are related to livestock.
Therefore, this study aims to identify livestock-related vehicles among freight vehicles and extract their trajectories by matching livestock-related facility visit history data with DTG data. It also aims to establish an origin-destination matrix of livestock-related vehicles using the visit history data so that the network connections underlying the spread of livestock diseases can be analyzed in the future. Through this study, the actual routes of livestock-related vehicles are expected to be identified, and the types of livestock products that are loaded and used could be derived. In many countries, DTG or GPS devices are installed in commercial cargo vehicles. The data collected from this equipment do not indicate what the cargo vehicles are carrying, but livestock transport vehicles alone can be extracted using the data matching methodology presented in this study. An advantage of tracking the trajectories of the extracted livestock transport vehicles is that the livestock facilities they frequently visit can be identified, which makes it possible to predict the propagation route in the event of an epidemic. We believe that these techniques will be useful in Korea and in countries that do not operate real-time monitoring systems for livestock transport vehicles.

Necessity of Methodology
Integrated livestock-related data collection systems and infrastructure, such as the UK RADAR system, Denmark's CHR system, and the US NAHMS, have already been developed under government leadership in the livestock sector to enable appropriate control in the event of an epidemic [10][11][12][20][21]. Such systems are used for research on livestock diseases and for minimizing social damage in the livestock sector in the event of a disaster by preventing livestock diseases and their spread. However, most countries do not have such a system. Even countries such as China and Brazil, which rank first and sixth in the world in pork production, do not have separate livestock management databases or data collection systems [22]. This is because national-level data collection and integrated management systems are expensive to build and require a high level of technology to deploy and manage.
For this reason, this work aims to present a methodology for building integrated databases from data that are already being collected rather than from new systems. This will allow integrated livestock-related data to be derived from existing data collection systems and used efficiently during the transition to more advanced data management systems.

Data Description
This study conducted a case study on Gyeonggi-do, where South Korea's livestock-related facilities are concentrated. The analysis was based on visit history data provided by the Korea Animal Health Integrated System, a six-level road network provided by the Korea Transport Database, and DTG data provided by the Korea Transportation Safety Authority. Tables 1-3 show example records illustrating the structure of each dataset. The visit history data record, for each access by a vehicle to a registered livestock-related facility, the time of access, the type of livestock being loaded, the purpose of driving, and the facility's address. A total of 2,499,788 records covering 1 month, from 1 December 2017 to 31 December 2017, were used for the analysis. The statuses of the 11,472 facilities and 117,404 links in Gyeonggi-do are shown in Figure 1, and the frequency of operation for a total of 34,634 livestock vehicles is shown in Table 4.
The DTG data record information at 1-s intervals while commercial freight vehicles are being driven. The collected information includes the GPS location, time, and speed. For this study, 40 billion rows had been collected as of December 2017.

Building Process in the Integrated Data
As shown in Figure 2, the primary purpose of this study is to extract, from the DTG data, information on the vehicles that actually visited livestock-related facilities. However, as the DTG data are very large, they must be preprocessed to eliminate unnecessary variables and extract only the data for cargo vehicles. Therefore, the data for commercial vehicles (buses, trucks, and taxis) were preprocessed. To speed up the preprocessing of the big data, the data were divided into 200 datasets and processed in parallel, and only the DTG data of cargo vehicles were extracted and accumulated in the computing system [23][24][25]. The addresses of livestock-related facilities in the visit history data were converted to coordinates through geocoding to obtain the facilities' locations. A spatial analysis was then conducted on the facilities' coordinates and the DTG data of the extracted cargo trucks. The visit history data were used to extract a list of visited facilities for each vehicle. Based on this list, the visit history data and DTG data were matched by vehicle ID. Finally, the integrated data, including the road links used by the livestock-related vehicles, the purposes of driving, and the loads carried, were presented for each origin-destination pair of livestock-related facilities. The process for integrating the database is illustrated in Figure 3.
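The split-and-filter preprocessing step described above can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the field names (`vehicle_id`, `vehicle_type`, `t`, `lat`, `lon`), the `"truck"` label, and the use of a thread pool are all assumptions, since the text does not specify the DTG schema or the parallel computing environment.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical vehicle-type label; the real DTG coding is not given in the text.
CARGO = "truck"

def split_into_chunks(records, n_chunks):
    """Split a list of records into at most n_chunks roughly equal parts."""
    size = max(1, -(-len(records) // n_chunks))  # ceiling division
    return [records[i:i + size] for i in range(0, len(records), size)]

def keep_cargo(chunk):
    """Keep only cargo-vehicle rows and drop variables not needed later."""
    return [
        {"vehicle_id": r["vehicle_id"], "t": r["t"], "lat": r["lat"], "lon": r["lon"]}
        for r in chunk
        if r["vehicle_type"] == CARGO
    ]

def preprocess(records, n_chunks=200, workers=4):
    """Filter each chunk independently, then concatenate the results in order."""
    chunks = split_into_chunks(records, n_chunks)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        filtered = list(pool.map(keep_cargo, chunks))
    return [row for chunk in filtered for row in chunk]
```

Because each chunk is filtered independently, the same pattern carries over to process-based or cluster-based parallelism when the data no longer fit on one machine.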

Integrated Database (DB) Establishment for the Analysis of Livestock-Related Vehicles
The integration method proposed in this study is based on a simple principle, but it has practical and scientific grounds. To extract livestock-related vehicles, this study defined a "livestock-related activity" as a cargo truck stopping for more than 5 min within 100 m of a livestock-related facility's coordinates. The 5-min threshold corresponds to the average time for loading and unloading livestock, as determined through a field survey. Additionally, the trip keys of cargo vehicles that carried out livestock-related activities more than two times a day were extracted to reflect the behavior of livestock loading and unloading, and these vehicles were considered livestock-related vehicles. These criteria can be supported statistically, and consultation with relevant experts also supported the conclusion that they are practically valid. To verify the matching results, the ratio of livestock-related vehicles among registered cargo vehicles was compared with the ratio of livestock-related vehicles derived from the DTG data. Table 5 shows the ratio of the 5441 livestock-related vehicles (5.25%) to the 103,700 cargo vehicles registered in Gyeonggi-do, along with the ratios of livestock-related vehicles extracted from the daily passing cargo vehicles recorded in the DTG data, using stop-time thresholds of 2, 5, and 10 min. The ratio obtained with the 5-min threshold was similar to that of the registered vehicles. To further support this result, the frequency distribution of stop times was calculated for the livestock-related vehicles in the visit history data at each facility. As shown in Figure 4, vehicles that stopped for more than 5 min accounted for 96.5% of all vehicles.
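The stop-based criterion can be illustrated with a minimal sketch. The 100 m radius and 5 min threshold come from the text; everything else is an assumption made for illustration: the tuple layout of the DTG points, the use of the haversine distance, treating a speed of zero as "stopped", and leaving the daily activity threshold (`min_activities`) to the caller because its exact reading is not fixed here.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    r = 6_371_000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def count_activities(points, facility, radius_m=100.0, min_stop_s=300, stop_speed=0.0):
    """Count stops longer than min_stop_s within radius_m of one facility.

    points: time-ordered (t_seconds, lat, lon, speed) tuples for one vehicle-day
            (hypothetical layout).
    facility: (lat, lon) of a geocoded facility.
    """
    activities = 0
    start = None      # time at which the current stop began
    counted = False   # whether the current stop has already been counted
    for t, lat, lon, speed in points:
        near = haversine_m(lat, lon, facility[0], facility[1]) <= radius_m
        if near and speed <= stop_speed:
            if start is None:
                start, counted = t, False
            if not counted and t - start > min_stop_s:
                activities += 1
                counted = True
        else:
            start = None  # the vehicle moved or left the radius
    return activities

def is_livestock_vehicle(points, facilities, min_activities):
    """Flag a vehicle-day whose activity count reaches the chosen threshold."""
    total = sum(count_activities(points, f) for f in facilities)
    return total >= min_activities
```

For the scale of data described in this study, a spatial index over the facility coordinates would be needed instead of the nested scan shown here.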
The trajectory of a vehicle with an origin-destination pair visiting more than two livestock-related facilities (as extracted on a 5-min basis) was visualized, as shown in Figure 5. To build a database that integrates facility information from the visit history data with location information from the DTG data, three attributes were matched: the route through which the vehicle visited facilities in the DTG data, the attribute values of each facility, and information on the road links actually used. First, the vehicle trajectory and facility locations in the visit history data were matched to check the details of the facilities visited by each vehicle and produce a route comprising the facilities visited, as shown in Figure 6.
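As an illustration of how a visit sequence can be turned into a route and into origin-destination pairs, consider the sketch below. The input layout, a list of `(timestamp, facility_id)` pairs per vehicle, is a hypothetical simplification of the matched visit records and not the format used in the study.

```python
from collections import Counter

def facility_route(visits):
    """Order one vehicle's visits by time and drop consecutive repeats.

    visits: iterable of (timestamp, facility_id) pairs (hypothetical layout).
    """
    route = []
    for _, facility_id in sorted(visits, key=lambda v: v[0]):
        if not route or route[-1] != facility_id:
            route.append(facility_id)
    return route

def od_pairs(route):
    """Turn a route such as [1, 2, 3, 4] into consecutive origin-destination pairs."""
    return list(zip(route, route[1:]))

def od_matrix(routes):
    """Count how often each origin-destination pair occurs across routes."""
    counts = Counter()
    for route in routes:
        counts.update(od_pairs(route))
    return counts
```

Storing the matrix as sparse pair counts rather than a dense table is a design choice that suits a case with thousands of facilities, most of which are never directly connected by a trip.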
The route of the livestock-related vehicle shown in Figure 6 is, in order, 1→2→3→4. Second, we analyzed the visit history data and compiled attributes for each livestock-related facility, including the type of facility, the purposes of visiting vehicles, and breeds. Finally, the movement trajectories of livestock-related vehicles were matched to road links to estimate the links they used, and the time spent on each road was aggregated. This analysis resulted in an integrated DB with 279,135 rows, partly shown in Table 6, that matches the origin-destination facility locations, the road links used, and the passage and occupancy times.
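The aggregation of time spent on each road link can be sketched as follows, assuming the DTG points have already been matched to road links (the map-matching step itself is not shown). The `(t_seconds, link_id)` layout and the rule of attributing each sampling interval to the link of the earlier point are assumptions made for this illustration.

```python
from collections import defaultdict

def link_occupancy(matched_points):
    """Sum the time spent on each road link for one origin-destination trip.

    matched_points: time-ordered (t_seconds, link_id) pairs (hypothetical layout).
    """
    occupancy = defaultdict(float)
    for (t0, link0), (t1, _) in zip(matched_points, matched_points[1:]):
        # Attribute the interval between two samples to the link of the first sample.
        occupancy[link0] += t1 - t0
    return dict(occupancy)

def integrated_rows(origin, destination, matched_points):
    """Build (origin, destination, link, occupancy_seconds) rows for one trip."""
    occupancy = link_occupancy(matched_points)
    return [(origin, destination, link, seconds) for link, seconds in occupancy.items()]
```

Each returned tuple corresponds loosely to one row of an integrated table keyed by origin facility, destination facility, and road link.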

Utilization of Analysis Results and Future Research
To verify the effectiveness of the integrated database, this study visualized the travel time by link from the database and compared it with the spread of foot-and-mouth disease and avian influenza in Korea as of 2011, as shown in Figure 7 [26]. Foot-and-mouth disease and avian influenza spread near Gyeonggi-do, Gangneung-si, and Haenam-gun, areas with long exposure times to livestock-related vehicles.
Through this study, the actual movement trajectories of livestock-related vehicles were derived, and on this basis, areas with a high risk of spreading livestock diseases were identified. We expect that advancing the integrated database building model presented in this study will make it possible to find the propagation pathways of infectious diseases. It should also be possible to analyze the trajectories of livestock-related vehicles in relation to the characteristics of the roads they travel, such as traffic volume, number of lanes, and road grade. Finally, topology information based on the physical network of each livestock facility will be applied to build a model that reflects the relationships between livestock facilities.

Conclusions
A road mobility analysis of livestock-related vehicles is essential for efficiently preventing the spread of livestock epidemics. However, current animal disease prevention measures follow a uniform procedure, without any analysis of the road mobility of livestock-related vehicles. Moreover, the livestock-related vehicle trajectory data required for accurate analysis are not considered.
As big data have recently become open to the public, DTG data, including cargo vehicles' travel information by microscopic temporal and spatial scopes and visit history data collecting information on entering and leaving livestock-related facilities, have become available. The analysis of such data for livestock disease prevention has become convenient through various programming tools.
Therefore, the purpose of this study was to use plausible logic to collect relevant data and to integrate information on the road links used by livestock-related vehicles and the livestock-related facilities they visited, in order to establish the DB needed for analyzing the spread of livestock epidemics. Of the 5146 vehicles that performed a "livestock-related activity" according to the criteria proposed in this study, 670 freight vehicles visited more than two livestock-related facilities, and physical networks were formed from the actual road links they used. In addition, a unified DB of 279,135 trajectories was established by matching vehicle location information from DTG data with livestock-related facility information from visit history data.
The spread of livestock epidemics is closely related to the mobility of livestock-related vehicles, but establishing a national-level system to collect and manage the movement trajectories of all livestock-related vehicles is costly. This study therefore proposes a methodological framework that makes use of existing systems by integrating the DTG data with visit history data to form a DB covering the actual movements of livestock-related vehicles. The next step is to quantitatively estimate the risk of livestock disease spread on roadway links by analyzing how many livestock-related vehicles pass through each link. An integrated DB such as the one built in this study is essential for quantifying the risk by roadway link, as it provides travel information on livestock-related vehicles in microscopic temporal and spatial units. It is expected that the use of the integrated livestock-related vehicle DB will pave the way to preventing the spread of livestock epidemics. The DTG data are aggregated only for commercial vehicles; where this leaves too few samples for analysis, navigation data could additionally be integrated to build and analyze a DB covering the entire set of livestock-related vehicles. It is also expected that epidemiological studies on infectious diseases targeting specific breeds will become possible through the DB's information on the types of livestock associated with each facility.
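As a rough indication of what this next step might involve, the sketch below counts the distinct livestock-related vehicles passing through each link. It is a hypothetical starting point only: the `(vehicle_id, link_id)` row layout is assumed, and a vehicle count is not itself a validated risk indicator.

```python
from collections import defaultdict

def vehicles_per_link(rows):
    """Count distinct vehicles per road link.

    rows: iterable of (vehicle_id, link_id) pairs (hypothetical layout).
    """
    seen = defaultdict(set)
    for vehicle_id, link_id in rows:
        seen[link_id].add(vehicle_id)
    return {link_id: len(vehicles) for link_id, vehicles in seen.items()}
```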
Since the spatial extent of this study was South Korea, the criteria for extracting livestock transport vehicles from the integrated DB need to be readjusted before being applied to countries where the geographical characteristics of livestock facility locations differ. Therefore, future research on criteria applicable worldwide will be required. Furthermore, future studies will be able to calculate the occupancy time for each road link used by livestock-related vehicles and present risk exposure indicators related to the spread of infectious diseases.

Data Availability Statement: Restrictions apply to the availability of these data. The data were obtained from the Korea Transportation Safety Authority and are available at http://www.kotsa.or.kr (accessed on 21 January 2021) with the permission of the Korea Transportation Safety Authority.

Conflicts of Interest:
The authors declare no conflict of interest.