1. Introduction
With the development of drone technology, drones have proliferated around the world. The embedded device of the drone, equipped with wireless communication capabilities and tethered to the controllers, ascends into the sky to perform aerial tasks, earning the moniker “Mobile Internet of Things” [1]. Nowadays, drones are useful across diverse domains encompassing law enforcement [2], public safety operations [3], search and rescue missions, agricultural practices, and recreational pursuits. Projections indicate that, by the year 2025, the drone market is poised to yield a revenue stream of $18 billion, underscoring the significance and commercial viability of this transformative technology.
However, drones, while possessing significant potential for beneficial applications, are not immune to exploitation for less benign purposes. The small size and remote operability of drones allow them to access otherwise inaccessible places, a feature that has been useful in military operations and conflicts. The increase in commercial drones has raised significant concerns regarding their misuse in criminal activities, and these concerns have gained urgency due to several notable incidents. For instance, in July 2021 [4], an Italian prisoner shot at fellow inmates with a weapon believed to have been smuggled in by a drone through the bars of their cell. In April 2022 [5], a tourist-operated drone crashed onto the roof of Rome’s historic Palazzo Venezia. The drone was seized during search measures, and the tourist now faces potential criminal prosecution. In June 2022 [6], a drone was reported to have hindered the approach of a passenger plane to Berlin’s BER airport. The operators in these cases all used DJI drones, which belong to the same series studied in this paper. Due to the proliferation of commercial drones and the surge in criminal activities involving them, there is an urgent need to develop reliable drone forensics technology. We anticipate that this need will grow as drones become more readily available and are used in various criminal activities. In the above cases, investigators need to analyze drone-related data to reconstruct the facts, which is crucial for preventing illegal activities and ensuring flight safety.
In criminal cases, such as drug smuggling and contraband transportation, law enforcement officials may choose to forcibly shoot down the drone. However, identifying the drone user can be a significant challenge for forensic analysis. During the past 15 years, there has been a growing interest in digital forensic research to anticipate the potential misuse of drones in different criminal contexts. Drones possess several data interfaces, including internal, external, wired, and wireless [7], which makes them vulnerable to external attacks. These vulnerabilities enable attackers to obtain access to a drone’s internal data either physically or through remote means. Consequently, attackers can forge and inject messages that may influence the decision-making processes and control of the drone’s internal system, potentially causing drone accidents. For instance, suspects might fly drones into restricted airspace or divert them from designated flight routes. As intricate electronic systems with networking capabilities, drones and their associated apparatuses store data encompassing flight patterns, GPS coordinates, and user profiles. In adjudicating criminal proceedings, legal authorities are tasked with amassing evidentiary material and subjecting it to thorough scrutiny to ascertain the veracity of allegations.
The process of digital forensics involves investigators uncovering the truth in criminal cases by examining data in digital systems. According to the National Institute of Standards and Technology (NIST), the digital forensic investigation is typically structured into four main phases: collection, examination, analysis, and reporting [8]. Drone forensics, in turn, focuses on scientifically analyzing flight data, user information, and other relevant data stored in drones and associated equipment to establish evidence. In January 2022, the NIST and the Scientific Working Group on Digital Evidence (SWGDE) introduced the Best Practices for Drone Forensics [9]. This publication defines system requirements for the storage, processing, equipment, extraction, and analysis of drone data. Similarly, in October 2023, the Ministry of Public Security of China issued a technical specification for examining digital data in civil drones [10]. The guidelines outline the requirements for extracting, analyzing, and identifying data from civil drones, with the standard coming into effect in December. Despite ongoing advancements in the field of digital forensics, numerous challenges remain to be addressed.
This work presents a comprehensive study of the forensic methods for the DJI Mavic 2 Pro, including an analysis of its internal structure and working mechanism. The main contributions of this research are as follows:
We first conduct an analysis of the internal storage structure of the DJI Mavic 2 Pro small drone system to highlight the forensic challenges it presents. Subsequently, we review existing forensic methods and evaluate their applicability while clarifying the data sources for drone forensics. Moreover, we propose a model for extracting data from drones.
Using the DJI Mavic 2 Pro as a case study, we analyze the structure of the extracted file and proceed to conduct three case studies on different file types (DAT files, TXT files, and default files). Subsequently, we design the parsing process of extracted files and further develop a specialized forensic tool known as the drone data parser (DRDP).
Finally, we verify the applicability of evidence-collection methods and tools through experiments. Notably, we successfully parse encrypted data from the drone in question, which emphasizes the effectiveness of the proposed forensic methods and tools.
A multi-pronged research approach is deployed to ensure a comprehensive understanding of drone forensics and validation of the forensic process. This approach combines a literature review, empirical examination, and case study analysis. The study’s results have the potential to significantly impact academic research and practical applications. Furthermore, they may provide law enforcement agencies with effective tools for investigating drone-related incidents and contribute to the advancement of drone forensics technology.
This paper is organized as follows:
Section 2 starts with a summary of previous research on drone forensics. Following this,
Section 3 details the experimental setup and the data extraction model used for analyzing drones. In this section, the analysis of data retrieved from storage devices is presented, along with an explanation of the drone’s proprietary file structures. Additionally, the development of the drone data parser (DRDP), a tool used for drone forensics, is described. The findings and constraints identified in our analysis are discussed in
Section 4 and
Section 5.
2. Related Work
Drone forensics is an active research topic in drone data acquisition and analysis; existing drone forensics methods can be divided into theoretical forensics and data-driven forensics [11].
Theoretical forensics focuses on model construction and data extraction. In terms of forensic model construction, Arafat al-Dhaqm et al. (2022) [12] proposed a comprehensive forensic model that can adaptively investigate different types of drone data. Fahad Mazaed Alotaibi et al. [13] proposed a comprehensive collection and analysis forensic model for drone forensics. Through data-interdependent processes and evidence reconstruction, the proposed model outperforms other forensic models in terms of standardizing evidence collection. However, these forensic models are not directly connected with forensic procedures, which degrades their performance in drone forensics and limits their usefulness in real cases.
In terms of data extraction, Qingyi Tian and Baoshun Li (2017) [14] extracted and analyzed the flight data of the DJI Phantom 3 from the internal TF card, mobile terminal APP, and external TF card. The flight data could be obtained directly since the DJI drone data from the mobile terminal APP were not encrypted or encoded. Therefore, this study is not applicable to encrypted data scenarios. Zijun Yan, Mingyu Fan, et al. (2017) [15] proposed solutions for identifying drone users and monitoring their behavior, including a triple identity verification mechanism and a user threat decision model. The drone data used in this research were also not encrypted. Because their data extraction process assumes unencrypted data, these solutions have not yet been applied to actual cases.
Data-driven forensics focuses on extracting data from drones and related devices. In terms of drone data forensics, Maryam Yousef et al. [
16] used iOS-based smartphone devices to conduct a forensic investigation of Unmanned Air Systems, specifically the DJI Mavic Air. Hana Bouafif et al. [
17] conducted forensic analysis on the Parrot AR drone version 2.0 and found that the internal file system of the drone can be accessed through the FTP protocol, a trait that drone forensics shares with the forensics of other equipment; they also summarized how drone forensics differs. Maryam Yousef [
18] analyzed data extracted from four types of drones. The applicability and related functions of several commercial and open-source forensic tools are also compared. Ravin Kumar et al. [
19] extracted and analyzed GPS data from three different families of drones and visualized the location information. This work introduced a new utility called FlyLog Converter Tool, which can process and convert the Parrot drone’s flight log from a “.txt/.json” format to an easy-to-understand “.csv” format. Moreover, Thomas Edward Allen Barton et al. compared the operational variations of the DJI Phantom 3 and the AR Drone 2.0. They discovered that log information and flight logs are commonly stored in the data of the controller application.
In addition to investigating and analyzing drone data, the data from relevant devices also need to be analyzed. Farkhund Iqbal et al. [
20] used a variety of smartphone devices to conduct the forensic analysis of unmanned aerial systems under different operating systems. In addition, the authors utilized various DJI tools and software to perform logical backups of the specified mobile phone models. They then conducted forensic analysis on the drone data within the backup files, revealing a substantial amount of forensic data within the DJI Phantom 4 APP. Graeme Horsman et al. [
21] extracted and analyzed data from the Parrot Bebop drone and FreeFlight3 mobile terminal control software, providing acquisition and analysis instructions for the internal storage of the device, as well as on-board flight data, media data, and operating system-related information.
The existing data-driven forensic methods have analyzed the flight logs of drones and other devices. They rely on existing forensics tools to analyze GPS data and visualize the flight path. However, these studies have not delved into the specific file structure and encryption methods of drones, which hinders further research. In conclusion, there is an urgent need to address the research problem of extracting and analyzing the file structure and encryption methods of drones. Therefore, we propose a drone data parser (DRDP) to extract data from the drones involved in criminal cases and conduct three case studies on the structure and encryption method of DAT, TXT, and default files, which provide detailed guidance for decrypting drone data and technical support for restoring the truth of drone cases in the future.
4. Three Cases of Digital Forensics on DJI Mavic 2 Pro
Since the firmware update in July 2017, data on the DJI drone system, including the drone body and related mobile terminal apps, have been stored in encrypted form. The data cannot be displayed directly, and the file structure is also different from before updating. Drone flight data originate from two main sources: the DAT file in the drone’s internal storage chip and the DAT, TXT, and default files in the mobile terminal application DJI GO 4. The drone’s internal chip allows communication and data exchange with external networks, while the mobile terminal application DJI GO 4 controls drone flight and stores data on the mobile device, which is essential in drone-related cases. The encrypted and encoded files can be decrypted and decoded to extract GPS data, motor information, flight status, and other relevant information. Additionally, the DJI GO 4 app provides relevant information about drone users, such as username, nickname, phone number, and other data. To showcase the utility of the developed procedure, three case studies were conducted using different file types (DAT files, TXT files, and default files), with the DJI Mavic 2 Pro serving as the illustration [
25].
4.1. Case 1: DAT Files Extracted from Drone and Mobile Device
1. DJI_ASSISTANT_EXPORT_FILE_YYYY_MM_DD_HH-MM-SS.DAT
The DAT file was analyzed, and “DJI_ASSISTANT_EXPORT_FILE_YYYY_MM_DD_HH-MM-SS.DAT” was decompressed using zlib. The 283 header bytes were removed, and the remaining content was segmented into 256 bytes of DAT header information followed by N flight records, each referred to as a record. The record structure repeats until the end of the file. It is important to note that records are interpreted differently depending on their type. The length of each DAT file varies depending on the data written, but all files follow a common structure. Each record begins with a fixed starting value of 0x55 and ends with two bytes of check digits. After analysis, we determined that the length of a record can range from 14 to 245 bytes. In summary, the structure of the export DAT file is shown in Figure 3.
Each record begins with 0x55. The second byte represents the record length, while the fifth and sixth bytes indicate the record type. The seventh to tenth bytes are the serial number, referred to as the ticket_number. The last two bytes are check digits, and the remaining bytes make up the payload. Upon analysis, it was discovered that the payload is encrypted using the XOR algorithm. XOR compares two bits and yields 0 when they are identical and 1 when they differ. Because applying the same key a second time restores the original value, XOR is commonly used for information encryption [26]. In this case, we need to reverse the XOR encryption applied to the payload to reveal the plaintext information.
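The record layout described above can be sketched in Python as follows. This is a minimal illustration, not the DRDP implementation: the names `DatRecord` and `split_record` are hypothetical, the third and fourth bytes are left uninterpreted because the text does not describe them, and little-endian byte order for the record type and ticket_number is an assumption.

```python
import struct
from typing import NamedTuple


class DatRecord(NamedTuple):
    length: int          # second byte of the record
    record_type: int     # fifth and sixth bytes
    ticket_number: int   # seventh to tenth bytes
    payload: bytes       # still XOR-encrypted at this point
    check: bytes         # last two bytes (check digits, not verified here)


def split_record(buf: bytes, offset: int = 0) -> DatRecord:
    """Split one record that starts at `offset` into its fields."""
    if buf[offset] != 0x55:
        raise ValueError("record does not start with 0x55")
    length = buf[offset + 1]
    # 10 header bytes plus 2 check bytes is the smallest layout that fits.
    if length < 12:
        raise ValueError("record length too small")
    record = buf[offset:offset + length]
    if len(record) < length:
        raise ValueError("truncated record")
    # Assumption: both multi-byte fields are stored little-endian.
    record_type, ticket_number = struct.unpack_from("<HI", record, 4)
    return DatRecord(length, record_type, ticket_number, record[10:-2], record[-2:])
```

Walking a whole file would repeat this call, advancing `offset` by the returned `length` each time.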
Through reverse analysis, it was discovered that each byte of the payload in every record of the DAT file is XOR-encrypted with the value of “ticket_number%256” (hereinafter referred to as the key). We select one of the records for analysis; the specific structure of the record is shown in Figure 4.
The length of this record is 0x54, and the record type is 0x0830, which stores information such as flight time, longitude, latitude, height, and velocity components (x, y, z). The ticket_number is 0x03EF1D7C, and the encrypted payload ranges from 0x9BB4 to 0x9BFD, with the check digits being 0xB07E. The key for this record, ticket_number%256, is therefore 0x7C. The following result was obtained after XOR decryption:
payload Cipher text: 7A 41 48 7D 98 9C 7D 7C 2F 60 22 34 1A 53 E9 6E 2D 2B 7C 7C 7C 7C C4 BD 7C 7C 58 3E 7C 7C 9C BC
each payload Cipher text byte ⊕ ticket_number%256 (0x7C)
payload in plaintext: 06 3D 34 01 E4 E0 01 00 53 1C 5E 48 66 2F 95 12 51 57 00 00 00 00 B8 C1 00 00 24 42 00 00 E0 C0
After XOR decryption, the first 32 bytes of the plaintext payload are grouped in sets of 4 bytes and read in little-endian order. The results obtained are shown in Table 2.
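The decryption and grouping steps for this record can be reproduced with a short Python sketch. The function name `xor_decrypt` is hypothetical, and reading every 4-byte group as an unsigned 32-bit integer is an assumption made only for illustration; the field types listed in Table 2 take precedence.

```python
import struct


def xor_decrypt(payload: bytes, ticket_number: int) -> bytes:
    """XOR every payload byte with the key ticket_number % 256."""
    key = ticket_number % 256
    return bytes(b ^ key for b in payload)


# Ciphertext of the example record, as listed in the text.
cipher = bytes.fromhex(
    "7A 41 48 7D 98 9C 7D 7C 2F 60 22 34 1A 53 E9 6E "
    "2D 2B 7C 7C 7C 7C C4 BD 7C 7C 58 3E 7C 7C 9C BC"
)

plain = xor_decrypt(cipher, 0x03EF1D7C)

# Group the 32 plaintext bytes into eight 4-byte little-endian words.
# Assumption: unsigned integers; some fields may really be floats.
words = struct.unpack("<8I", plain)

print(plain.hex(" ").upper())
print(words)
```

Because XOR is its own inverse, calling `xor_decrypt` on the plaintext with the same ticket_number returns the original ciphertext.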
The check value of each record is located in the last two bytes and is generated using the “crc16-ccitt” check method. A Cyclic Redundancy Check (CRC) is a form of channel coding that generates a compact verification code based on data such as network packets or computer files. It is primarily employed to identify errors that could occur after data transmission or storage. The “crc16-ccitt” is used to verify the DAT file and determine whether its data has been altered.
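A check of this kind could be sketched with Python’s standard `binascii.crc_hqx`, which implements a CRC-16 with the CCITT polynomial 0x1021. The text does not state the initial value, which bytes the check covers, or the byte order of the stored check digits, so the choices below (initial value 0xFFFF, all bytes before the check digits, little-endian storage) are assumptions, and the names `crc16_ccitt` and `record_is_intact` are hypothetical.

```python
import binascii


def crc16_ccitt(data: bytes, init: int = 0xFFFF) -> int:
    """CRC-16 with polynomial 0x1021; the initial value is an assumption."""
    return binascii.crc_hqx(data, init)


def record_is_intact(record: bytes, init: int = 0xFFFF) -> bool:
    """Compare the computed CRC with the last two bytes of the record.

    Assumptions: the check covers every byte before the check digits,
    and the check digits are stored in little-endian order.
    """
    stored = int.from_bytes(record[-2:], "little")
    return crc16_ccitt(record[:-2], init) == stored
```

If the computed and stored values disagree for every record of a known-good file, the assumed parameters are wrong and should be revisited before any conclusion about tampering is drawn.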
2. YY_MM_DD_[HH-MM-SS]_FLY###.DAT
The file “YY_MM_DD_[HH-MM-SS]_FLY###.DAT” has the same information and structure as the “DJI_ASSISTANT_EXPORT_FILE_YYYY_MM_DD_HH-MM-SS.DAT” file after the latter has been decompressed and its 283 header bytes removed. The plaintext can be obtained using the decryption method described above, so the details are not repeated here.
4.2. Case 2: The TXT File Extracted from Mobile Device
The file “DJIFlightRecord_YYYY_MM_DD_[HH-MM-SS].txt” is generated by the software DJI GO 4 [
24]. Its structure includes header information, record, and drone attribute details. The 100-byte header information includes the start and end positions of the record, detailed information, and the file version number. Apart from the 100-byte header information and the trailing N-byte (variable length) detailed information, the central part is referred to as a record, structured similarly to a DAT file. This contains flight record time, latitude and longitude, flight altitude, and various serial numbers of the drone. The detailed information includes attributes of the drone flight, such as the model, serial number, location city, and the number of pictures and videos taken during the flight. Some attribute information is presented in plaintext. It is possible to obtain a lot of information directly from the last N bytes of the TXT file, such as the aircraft serial number, battery serial number, and camera serial number.
The flight data saved as TXT files using DJI GO 4 is organized by record type. Record type 0x01 includes information such as longitude, latitude, and flight altitude, while record type 0x05 contains data on current flight time, speed, and distance. Similar to the DAT files, the data are encrypted and encoded. Although the verification method employed may vary, the underlying principles remain consistent.
Figure 5 illustrates the file structure of the TXT file.
Each record begins with either 0x01 or 0x05, denoting the record type (hereafter referred to as recordType). The second byte in the record signifies the record’s length, while the final byte is always 0xFF, indicating the end of the record. The remaining bytes contain XOR-encrypted data (referred to as the payload). Reverse analysis showed that the recordType and the first byte of the payload (referred to as keyByte) are used to generate a crc64 check value (named scrambleBytes). Each byte of the payload is then XOR-encrypted with the corresponding byte of scrambleBytes, with the 8-byte value repeating over the payload. We select two records for analysis; the specific structure of the records is shown in Figure 6.
The first record type is 0x01, which stores information such as longitude, latitude, and flight altitude. The length of the record is 0x39, with the keyByte being 0x04. The payload encryption information for this record spans from 0x034E to 0x0361. Upon calculation, the scrambleBytes for this record are 0xD35F32DED56B26EB, the value that reproduces the plaintext below. The final result is obtained after XOR decryption:
payload Cipher text: 34 D2 A7 95 04 98 26 AB 10 9D 84 E3 45 02 C7 D4 D3 5F
scrambleBytes: 0xD35F32DED56B26EB
each payload Cipher text byte ⊕ the corresponding byte of 0xD35F32DED56B26EB (repeated)
payload in plaintext: E7 8D 95 4B D1 F3 00 40 C3 C2 B6 3D 90 69 E1 3F 00 00
The second record type is 0x05, which stores information such as flight time, flight speed, and flight distance. The length of the record is 0x14, with the keyByte being 0x05, and the payload encryption information spans from 0x0312 to 0x0349. Upon calculation, the scrambleBytes of this record are 0x9A02DFAC590F5202, the value that reproduces the plaintext below. The result is obtained after XOR decryption:
payload Cipher text: 9A 02 DF AC 59 0F 52 02 9A 02 BA C4 EC 37 21 03 9A 02
scrambleBytes: 0x9A02DFAC590F5202
each payload Cipher text byte ⊕ the corresponding byte of 0x9A02DFAC590F5202 (repeated)
payload in plaintext: 00 00 00 00 00 00 00 00 00 00 65 68 B5 38 73 01 00 00
After XOR decryption, the plaintext payload can be formatted into groups of either 8 + 8 + 2 bytes or 4 + 4 + 2 + 8 bytes. Interpreting each data type individually, the plaintext payload information can be extracted. The results are presented in
Table 3.
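The XOR step and the two groupings can be checked with the following Python sketch. The helper names `xor_with_scramble` and `split_groups` are hypothetical. The key passed to each call is the 8-byte value that reproduces the plaintext listed above for that ciphertext; deriving it from recordType and keyByte via crc64 is not shown. Treating the two 8-byte groups of the 0x01 record as little-endian doubles holding angles in radians is an assumption for illustration only, and Table 3 remains the reference for field meanings.

```python
import math
import struct
from itertools import cycle


def xor_with_scramble(payload: bytes, scramble_bytes: bytes) -> bytes:
    """XOR each payload byte with scrambleBytes, repeating the 8-byte value."""
    return bytes(c ^ k for c, k in zip(payload, cycle(scramble_bytes)))


def split_groups(data: bytes, sizes: tuple) -> list:
    """Cut data into consecutive groups of the given byte sizes."""
    groups, pos = [], 0
    for size in sizes:
        groups.append(data[pos:pos + size])
        pos += size
    return groups


cipher_01 = bytes.fromhex("34 D2 A7 95 04 98 26 AB 10 9D 84 E3 45 02 C7 D4 D3 5F")
cipher_05 = bytes.fromhex("9A 02 DF AC 59 0F 52 02 9A 02 BA C4 EC 37 21 03 9A 02")

plain_01 = xor_with_scramble(cipher_01, bytes.fromhex("D35F32DED56B26EB"))
plain_05 = xor_with_scramble(cipher_05, bytes.fromhex("9A02DFAC590F5202"))

groups_01 = split_groups(plain_01, (8, 8, 2))      # 8 + 8 + 2 bytes
groups_05 = split_groups(plain_05, (4, 4, 2, 8))   # 4 + 4 + 2 + 8 bytes

# Assumption: two little-endian doubles in radians, converted to degrees.
first_deg, second_deg = (math.degrees(v) for v in struct.unpack("<dd", plain_01[:16]))
print(plain_01.hex(" ").upper())
print(plain_05.hex(" ").upper())
print(round(first_deg, 4), round(second_deg, 4))
```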
The detailed information provided includes attributes of the drone, such as its name “mavic pro se” and serial number “08RDE4E00102YL”. Additionally, it encompasses information regarding the maximum flight altitude, maximum flight speed, number of photos, and other details found in the TXT file.
4.3. Case 3: The Default File Extracted from Mobile Device
The “mmkv.default” file was examined, and it was found to use base64 encoding. Base64 encoding treats the original data as a sequence of 8-bit bytes, regroups the bits into 6-bit values, and maps each value to the corresponding character in the 64-character base64 encoding table. It is important to note that, when the data length is not a multiple of 3 bytes, the final encoded group is padded with the “=” character. A portion of the “mmkv.default” file is shown in Figure 7:
The “mmkv.default” file was parsed in the key-value storage format, and the value part of each entry was base64 decoded. Following the attribute record storage structure, the drone account nickname was located in the file, as highlighted in the yellow section of Figure 7. Specifically, the key “key_account_nickname” occupies 0x06B2 to 0x06C5, and the byte at 0x06B1 gives its length, 0x14. The byte at 0x06C7 gives the length of the value, 0x0C. The value stored from 0x06C8 to 0x06D3 is the base64 string “ZGppdGVzdA==”, which decodes to “djitest”. By decoding the contents of the “mmkv.default” file systematically and interpreting them in key-value format, user information such as user ID, nickname, phone number, and region can be extracted.
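The nickname entry described above can be mimicked with a small Python sketch. It is a simplified illustration under stated assumptions, not a full parser for the file: `read_entry` is a hypothetical helper, lengths are assumed to fit in a single byte, and the byte between the key and the value length (offset 0x06C6 in the example) is not described in the text, so the sketch skips it and the sample uses a placeholder there.

```python
import base64


def read_entry(buf: bytes, offset: int) -> tuple:
    """Read one length-prefixed key and its base64-encoded value."""
    key_len = buf[offset]
    key_start = offset + 1
    key = buf[key_start:key_start + key_len].decode("ascii")
    # Assumption: one undescribed byte sits between the key and the value length.
    value_len_pos = key_start + key_len + 1
    value_len = buf[value_len_pos]
    value_start = value_len_pos + 1
    encoded = buf[value_start:value_start + value_len]
    return key, base64.b64decode(encoded).decode("utf-8")


# Synthetic bytes laid out like the example; 0x0D is only a placeholder.
sample = (
    bytes([0x14]) + b"key_account_nickname"
    + bytes([0x0D, 0x0C]) + b"ZGppdGVzdA=="
)

key, value = read_entry(sample, 0)
print(key, value)
```

The same helper could be pointed at other keys once their offsets are known, although real files may use multi-byte lengths that this sketch does not handle.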
The extraction analysis of DJI Mavic 2 Pro retrieved useful forensic data, including GPS data, user details, dates, and other information, which were obtained from various sources and encrypted using different methods. Forensically relevant drone information is summarized in
Table 4. To facilitate a clear and concise illustration of the recommended data acquisition process for investigators, we advocate utilizing the flow diagram presented in
Figure 8.
5. Result Testing and Discussion
This section outlines the tests performed to verify the feasibility of the forensic program DRDP. Nine sets of “DJI_ASSISTANT_EXPORT_FILE_YYYY_MM_DD_HH-MM-SS.DAT” files were exported and then chosen for testing. Correspondingly, nine “YY_MM_DD_[HH-MM-SS]_FLY###.DAT” and nine “DJIFlightRecord_YYYY_MM_DD_[HH-MM-SS].txt” files were extracted from the mobile terminal for joint testing. The DRDP successfully parsed nine data sets.
Table 5 presents part of the parsing results from the CSV file. The velocity components (x, y, z) were not found in the TXT file. Table 6 displays the DAT-parsed results from the CSV file.
The flight record is obtained by processing the last N bytes of information from the TXT file through the rstrip() function and transcoding it into a string. This record contains important information such as the flight start time, starting GPS location, total distance and duration of the flight, maximum altitude reached, and number of photos taken.
Running the DRDP tool with the command “python main.py -u” parses user information. It accurately extracts the user ID, nickname, phone number, and region of drone users, as shown in Table 7.
User information is analyzed to enhance the completeness of the evidence chain and provide additional insights into suspects. This information can be corroborated and reinforced by other evidence. We also made a notable observation: when analyzing the drone DAT data, the earliest flight record file was incomplete, retaining only a limited amount of information. After conducting multiple export and analysis tests, we determined that the drone continuously writes new flight data when powered on. Once the storage space is full, previous data are overwritten and cannot be recovered. Therefore, in practical forensic activities, forensic personnel should promptly export the data and power down the drone equipment to prevent the overwriting of flight data and potential data loss.
The research is restricted to the DJI Mavic 2 Pro and therefore does not provide a comprehensive understanding of all consumer drone forensics. Although some scholars are conducting relevant research on drone user information, the specific decryption algorithms and file structures have not been published; this work fills that gap and can serve as a foundation for subsequent drone forensics. Due to the significant time and energy costs associated with forensic analysis research, it is impractical to cover all or even most consumer drones. Our ongoing efforts involve continuous research in the field of drone forensics and collaboration with governments to encourage drone manufacturers to establish agreed-upon data standards for flight recordings. The forensic tool DRDP faces a major limitation due to the specialized data formats of DAT and TXT files: it cannot yet parse the full range of data that these files may contain.
By comparing the drone’s flight data with those of the mobile device, we can verify that the flight data originate from the Android phone controlling the DJI drone. However, the altitude value in the TXT file significantly differs from the value in the DAT file, and the specific reason for this discrepancy remains unclear. This discrepancy will serve as the starting point for our next phase of research. Our future work will focus on further exploring the structures of the DAT and TXT files, as well as related files on the device, such as “mmkv.default”, which may contain important information. Additionally, we have attempted to reverse engineer the drone’s firmware, but more work is needed to illuminate data that cannot be parsed in the current analysis. In the future, further research should be conducted in other areas related to drones where reliable data can be obtained.
6. Conclusions and Future Work
In this paper, a drone data parser (DRDP) was designed to analyze the file structure and encryption method of the DJI Mavic 2 Pro. The proposed method can parse GPS information, flight time, altitude, distance, and velocity components (x, y, z) effectively, enabling the accurate analysis of the drone’s flight status and supporting the reconstruction of the facts of a case. The proposed DRDP is verified on real-world drone data files, including DAT files, TXT files, and default files, to confirm its effectiveness. The experiments show that the proposed DRDP can successfully parse the internal and external data of drones.
In future work, we plan to explore more features of drone data from DAT and TXT files. This expanded analysis will encompass the Euler angle data (pitch, yaw, and roll) associated with the drone to provide a detailed depiction of the drone’s state. Moreover, we aim to delve into the reverse engineering of drone firmware to ascertain the potential for inferring plain data when encryption methods differ. Furthermore, we intend to extend the current research scope by exploring a wider range of drone models and manufacturers.