A Multi-Modal Benchmark Dataset for UAV Wireless Communication Research

Alibabaie, Najmeh; Calabrò, Antonello; Marchetti, Eda

doi:10.3390/drones10040244



by Najmeh Alibabaie *, Antonello Calabrò and Eda Marchetti

Istituto di Scienza e Tecnologie dell’Informazione “A. Faedo”, Consiglio Nazionale delle Ricerche (CNR-ISTI), 56124 Pisa, Italy

* Author to whom correspondence should be addressed.

Drones 2026, 10(4), 244; https://doi.org/10.3390/drones10040244

Submission received: 13 February 2026 / Revised: 18 March 2026 / Accepted: 23 March 2026 / Published: 27 March 2026

(This article belongs to the Section Drone Communications)


Highlights

What are the main findings?

We present a large-scale, geometry-aware UAV communication dataset spanning rural, suburban, and urban regions, with static and mobile scenarios across sub-6 GHz, mmWave, and NB-IoT bands, generated using standardized 3GPP and ITU-R channel models.
The dataset integrates multi-modal information, including node-level metadata, link-level statistics, fine-grained multipath parameters, encoded ray–interaction sequences, and multi-format 3D geometries, with validated cross-source alignment in a representative urban region, showing sub-meter mean centroid agreement.

What are the implications of the main findings?

The structured combination of communication and environment modalities enables reproducible benchmarking for channel modeling, ML-based propagation prediction, and LOS/NLOS analysis in UAV networks.
By providing open-access, geometry-aligned, and ML-compatible data without requiring proprietary simulation tools, the dataset lowers entry barriers and establishes a standardized foundation for transparent and extensible UAV communication research.

Abstract

Data-centric approaches are increasingly shaping wireless communication research, where the availability and quality of datasets directly influence the reliability of learning-based and model-driven methods. In this context, unmanned aerial vehicle (UAV) communication poses unique challenges, as it requires datasets that jointly capture geometric information, propagation conditions, and diverse link configurations. This work introduces a geometry-aware UAV communication dataset designed to support research on controlled UAV communication link directions and propagation scenarios. The dataset is generated using standardized 3GPP and ITU-R channel models across multiple urban, suburban, and rural regions, accounting for variations in altitude, carrier frequency, and node distribution. The dataset provides spatially resolved channel parameters along with geometry-rich files containing environmental features, which can be used to extract relevant parameters for UAV communication studies. These data support reproducible research in geometry-aware channel modelling, path-loss prediction, LOS/NLOS analysis, delay-related modelling, and trajectory-conditioned link-quality analysis.

Keywords:

dataset generation; unmanned aerial vehicles; geometrical features; channel modelling; ray tracing

1. Introduction

Wireless communication systems today span a diverse range of platforms, from terrestrial infrastructure to aerial and spaceborne systems. Among these, UAV-based communication has gained increasing attention due to its growing applications in areas such as remote sensing, environmental monitoring [1], package delivery [2] and next-generation mobile networks [3]. Regardless of the specific use case, one common requirement is access to reliable and multi-modal datasets to evaluate system performance under various conditions, including the effect of different geometrical features on communication performance.

Existing datasets in UAV communication vary widely in terms of frequency bands, mobility patterns, channel models, and environmental settings. Although valuable, these datasets are often developed with specific assumptions and lack a unified structure or benchmarking standard [4]. This fragmentation stems from the natural variability of system configurations, spatial environments, and system goals. Moreover, no single dataset can fully capture the spatial and temporal dynamics inherent in real-world UAV communication scenarios.

On the other hand, while traditional wireless communications have been studied for decades, UAVs are a recent addition, especially in the context of 5G and beyond networks. As such, the field is still evolving, with much of the focus on designing new communication protocols, channel models, and techniques, while efforts to collect and share comprehensive datasets are comparatively less mature. In addition, conducting large-scale measurement campaigns for UAV communication is often constrained by regulatory and safety considerations. Acquiring flight permissions, especially in dense urban or metropolitan areas, typically requires coordination with civil aviation authorities, making real-world data collection time-consuming and difficult to scale [5].

This lack of readily available datasets makes it difficult to evaluate new methods and compare them to state-of-the-art solutions [6]. As a result, researchers are increasingly turning to synthetic datasets. However, many studies rely on proprietary or unpublished datasets, limiting reproducibility and comparability. Custom-built datasets, which are often poorly documented or not publicly accessible, further hinder progress by preventing researchers from reproducing results or building on each other’s work. A public, standardized dataset fosters transparency, ensures comparability across studies, and supports meaningful method comparisons. Furthermore, many existing datasets are either limited in scope or incompatible with machine learning (ML) frameworks.

In UAV communication, these challenges are compounded by the need for detailed geometric data to accurately model the communication environment. UAV communication channels are inherently shaped by the surrounding 3D environment [7].

However, extracting relevant geometrical features to model UAV communication channels is not a straightforward task. The geometrical features that influence communication vary depending on the research focus, whether it’s building heights, terrain features, or object boundaries. These features require different geometric representations: mesh faces, line segments, corner points, and 3D surface models. As a result, no single geometrical file format is universally suitable for all UAV communication studies, as each format captures a unique aspect of the environment.

Additionally, obtaining these geometrical features often requires multiple data sources and platforms [8]. Researchers must navigate them to assemble geometrical data. Despite the importance of geometrical data in UAV communication research, no effort has been made to provide a multi-modal dataset that covers a wide range of geometrical file formats and data sources.

Commercial software dependencies and licensing restrictions further hinder researchers, especially those in academia or smaller labs with limited resources, from obtaining multi-modal datasets to test and evaluate their proposed techniques [9].

To address these limitations, we introduce an open-access benchmark dataset for geometry-aware UAV wireless communication research. The contribution of this work is not merely the rapid construction of Wireless InSite (https://www.remcom.com/wireless-insite-propagation-software, accessed on 22 March 2026) scenarios from heterogeneous geospatial sources. Rather, the dataset explicitly aligns complementary environment modalities, including OSM, GeoJSON, DXF, OBJ, and STL-derived representations, with communication modalities spanning node-level metadata, link-level statistics, path-level descriptors, and ray-interaction sequences. This organization enables researchers to study how different geometric abstractions relate to propagation behavior and to develop reproducible, ML-ready benchmarks without requiring access to proprietary ray-tracing software.

The main contributions of this work are as follows:

Provide a geometry-aware benchmark dataset that jointly exposes aligned environment modalities and communication modalities at node, link, path, and interaction levels.
Cover static and mobile UAV communication scenarios across rural, suburban, and urban regions, and across 2 GHz, 3.5 GHz, and 28 GHz configurations designed in accordance with 3GPP and ITU-R guidance.
Structure the dataset in an ML-ready and openly accessible format that preserves explicit correspondences between geometry sources, node metadata, and propagation outputs.
Enable controlled benchmark tasks such as path-loss prediction, LOS/NLOS/Blocked classification, delay-related regression, and trajectory-conditioned link-quality analysis.
Document the scope and limitations of the current release, including its simulation-based nature, omnidirectional antenna baseline, and the absence of interference and network-level dynamics, while outlining clear directions for future extensions.

The rest of the paper is organized as follows: Section 2 reviews the available datasets related to UAV communications. Section 3 presents the simulation setup used for data generation. Section 4 provides a detailed description of the constructed dataset and its contents, while Section 5 shows how the released modalities can be used. Finally, Section 6 concludes the paper and discusses potential directions for future work.

2. Available Datasets

Wireless communication involving UAVs comprises a wide range of research and application domains. Datasets supporting these investigations can generally be divided into two major categories: real-world datasets and synthetic datasets. Each serves different roles in the development and evaluation of UAV communication systems. However, many of these datasets are not publicly available or lack the standardization needed for reproducible research [10,11].

Real datasets are collected through physical UAV deployments using actual communication hardware. These datasets reflect authentic wireless propagation phenomena. Their primary advantage lies in their fidelity, making them ideal for validating algorithms in controlled conditions. However, creating real datasets requires significant resources, including substantial investment in equipment, staff, and time for processing [12]. These datasets are also usually limited in size and flexibility. Changing the configuration often means repeating the entire measurement process, which is expensive and time-consuming.

In contrast, synthetic datasets are generated through simulation platforms that model UAV dynamics and wireless propagation environments [13]. These simulations can incorporate a wide variety of configurations and environmental conditions. Synthetic datasets offer high scalability and flexibility, enabling researchers to generate large quantities of labeled data across diverse scenarios.

However, the realism of synthetic datasets is constrained by the fidelity of the underlying simulation tools. Hardware impairments, unpredictable interference, and real-world noise behavior are difficult to reproduce accurately.

Because wireless propagation is inherently geometry-dependent, even slight changes in the spatial relationship between transmitter and receiver, such as those introduced by terrain, building layout, or atmospheric conditions, can yield substantially different channel behaviors [4]. As a result, the space of possible UAV communication scenarios is vast, and no single dataset can comprehensively cover all relevant conditions.

Several publicly available datasets support UAV-related wireless communication research, each offering different characteristics and use cases. The Aerial Experimentation and Research Platform for Advanced Wireless (AERPAW) serves as a testbed for aerial wireless experimentation, integrating drones, helikites, and software-defined radios [4].

AERPAW Dataset-12 [14] is a real-world rural air-to-ground (A2G) measurement set that includes in-phase and quadrature (IQ) samples, Global Positioning System (GPS) data, and received signal strength (RSS) values, suitable primarily for path loss modeling under line-of-sight (LOS) conditions. AERPAW Dataset-19 [15] expands on this by including channel impulse response (CIR) data and multipath information, enabling analysis of multipath propagation effects in rural LOS environments.

The MaMIMO UAV dataset [16] provides real channel state information (CSI) measurements collected in an outdoor campus environment. It includes spatial metadata such as UAV positions and trajectories, making it suitable for tasks such as beamforming, spatial channel modeling, and trajectory-based channel adaptation. However, it does not include raw IQ samples.

The dataset in [17] focuses on UAV-to-UAV communication in open-field LOS conditions at millimeter-wave (mmWave) frequencies. It provides directional channel information through beam scans, signal-to-noise ratio (SNR), RSS indicator, and received power, but lacks raw time-domain signals and rich multipath. It is particularly suitable for evaluating beam alignment performance and modeling directional mmWave path loss in hovering UAV setups.

In addition to these, the synthetic dataset in [18] presents OFDM signals generated under varying SNRs, delay profiles, and Doppler shifts, with ideal ground-truth channel frequency responses included. This dataset enables controlled experiments in deep learning-based channel estimation, with a particular focus on adversarial attack mitigation and robustness analysis in next-generation wireless networks. Although it is synthetic and lacks UAV-specific mobility patterns, it is valuable for supervised training and testing of neural network channel estimators in diverse 5G and beyond scenarios.

From a broader application perspective, UAV datasets support a wide variety of research domains, each with distinct data requirements. For instance, object detection, obstacle avoidance, and mapping typically rely on image or video data captured from UAV-mounted cameras [19]. In contrast, communication-centric applications rely on tabular or signal-level datasets. Even within tabular datasets, the specific features vary widely across existing datasets, with some focusing on raw I/Q samples [14], others on CSI [20], and others still on positioning or localization metrics. This diversity highlights the need for clearly defined datasets aligned with their intended use cases. Our work focuses specifically on physical-layer wireless communication, providing detailed spatial and propagation characteristics tailored for benchmarking UAV communication systems.

Table 1 highlights that the proposed dataset integrates multi-format 3D environmental representations, link-, path-, and interaction-level channel descriptors, high-fidelity ray-tracing outputs, and mobility-aware configurations across multiple frequency bands. To the best of our knowledge, no publicly available UAV communication dataset simultaneously provides these features in a unified and extensible framework.

Taken together, the datasets discussed above provide valuable resources for specific UAV communication studies, but they typically expose only a subset of the information required for geometry-aware benchmarking. Public real-world datasets often provide measurements with limited explicit 3D environmental representations, while synthetic datasets may provide labeled signals without aligned multi-format geometry or path- and interaction-level descriptors. In contrast, the proposed dataset is designed as a benchmark resource that provides environment files, node metadata, link-level statistics, path-level parameters, and ray-interaction sequences in a unified structure. This makes the dataset suitable for controlled studies of geometry-aware propagation modeling and ML-based inference tasks that are not adequately supported by existing public resources.

3. Simulation Setup

To generate a diverse and geometry-aware dataset for UAV communication scenarios, we selected eleven distinct simulation areas representing urban, suburban, and rural environments across various geographical regions. Table 2 summarizes the environmental types, spatial boundaries, and the approximate physical dimensions of these areas.

The 3rd Generation Partnership Project (3GPP) has introduced UAV-specific enhancements in Release 15 and subsequent releases to address UAV-specific communication challenges [20]. The simulation design follows deployment guidelines and propagation models from 3GPP TR 36.777 [21], 3GPP TR 38.901 [22], and ITU-R P.1410-5 [23]. The selected parameters, summarized in Table 3, serve as the basis for defining standard-compliant simulation conditions.

Deployment Details:

- UAV Nodes: Deployed in either a grid or random layout, with altitudes sampled from discrete values (30, 60, 90, 120 m) or continuously in the 40–110 m range. Grid deployments were used in suburban and rural scenes, while random layouts were applied in complex urban environments.
- Ground Nodes: In most regions, a $25 \times 25$ grid layout was used, representing structured deployments. Ground nodes are primarily placed at pedestrian height (1.5 m). In urban areas with irregular topography or building density, 30–40 nodes were randomly placed at heights of 0.5–10 m to reflect handheld, vehicular, or low-mounted sensors.
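The placement rules above can be sketched in a few lines of Python. This is only an illustrative sketch: the uniform sampling distribution, the function names, and the 500 m scene size are assumptions, since the text states the altitude values and height ranges but not the exact sampling procedure.

```python
import random

DISCRETE_ALTITUDES_M = (30, 60, 90, 120)  # discrete altitude set quoted in the text
CONTINUOUS_RANGE_M = (40.0, 110.0)        # continuous altitude range quoted in the text


def sample_uav_altitude(rng, continuous=False):
    """Draw one UAV altitude in metres (uniform sampling is an assumption)."""
    if continuous:
        return rng.uniform(*CONTINUOUS_RANGE_M)
    return rng.choice(DISCRETE_ALTITUDES_M)


def random_ground_nodes(rng, count, width_m, height_m, z_range_m=(0.5, 10.0)):
    """Place `count` ground nodes uniformly at random in a rectangular scene."""
    return [
        (
            rng.uniform(0.0, width_m),   # x in metres
            rng.uniform(0.0, height_m),  # y in metres
            rng.uniform(*z_range_m),     # node height in metres
        )
        for _ in range(count)
    ]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
uav_altitudes = [sample_uav_altitude(rng) for _ in range(5)]
urban_nodes = random_ground_nodes(rng, count=35, width_m=500.0, height_m=500.0)
```

A seeded generator keeps the layout reproducible, which matters when the same node set is reused across frequency bands.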

To illustrate the geometrical layout and spatial characteristics of these environments, Figure 1 presents representative subfigures for selected areas.

- Scene Sizes and UAV Counts: UAV counts were scaled proportionally to the area while ensuring a manageable simulation footprint. Smaller areas used 20–30 UAVs, while larger areas used up to 80. These values maintain consistent spatial density in line with 3GPP guidelines, while supporting vertical and geometric diversity. The node layouts were selected as controlled spatial sampling strategies rather than as exact replicas of a single operational deployment. For ground nodes, the standard $25 \times 25$ layout uniformly samples the scene’s horizontal extent and provides dense coverage for geometry-aware analysis. For a scene with width $W$ and height $H$, the nominal grid spacing is given by $\Delta_{x} = W / (25 - 1)$ and $\Delta_{y} = H / (25 - 1)$. As a result, the spacing adapts to the scene size while preserving the same sampling structure across environments. For example, the nominal spacing is approximately $21.5 \times 22.4$ m in Manhattan, $28.9 \times 30.6$ m in Palo Alto, and $43.4 \times 64.4$ m in Tennessee.
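As a quick check of the spacing formula, the sketch below builds a $25 \times 25$ ground grid for a rectangular scene. The 516 m by 537.6 m extent is a hypothetical value back-computed from the Manhattan spacing quoted above, not a figure taken from Table 2, and the function names are illustrative.

```python
def grid_spacing(width_m, height_m, n=25):
    """Nominal spacing of an n x n grid: W / (n - 1) and H / (n - 1)."""
    return width_m / (n - 1), height_m / (n - 1)


def ground_grid(width_m, height_m, n=25, z_m=1.5):
    """Return n * n ground-node positions (x, y, z) covering the scene extent."""
    dx, dy = grid_spacing(width_m, height_m, n)
    return [(i * dx, j * dy, z_m) for j in range(n) for i in range(n)]


# Hypothetical extent chosen to reproduce the quoted Manhattan spacing.
dx, dy = grid_spacing(516.0, 537.6)
grid = ground_grid(516.0, 537.6)
print(f"spacing: {dx:.1f} m x {dy:.1f} m, nodes: {len(grid)}")
```

Dividing by 24 rather than 25 places the first and last nodes on the scene boundary, so the grid spans the full width and height.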

In Figure 2, ground nodes follow the standard $25 \times 25$ grid deployment at approximately 1.5 m height, providing dense spatial sampling of the scene. UAV nodes are sampled at representative discrete altitudes (30, 60, 90, and 120 m) and along a trajectory used to generate mobile channel snapshots. Urban buildings are shown schematically to illustrate the influence of 3D morphology on line-of-sight and obstructed propagation paths.

Ground nodes in grid deployments were fixed at 1.5 m to match the standard outdoor user-equipment height commonly used in 3GPP-based evaluations. In contrast, the 0.5–10 m range adopted in irregular urban scenes was used to represent heterogeneous low-altitude endpoints, including handheld devices, vehicular terminals, and low-mounted sensors, while avoiding an artificial regular lattice in dense and morphologically complex environments.

Similarly, static UAV placements were designed to provide controlled spatial and vertical coverage. The discrete altitude set $\{30, 60, 90, 120\}$ m spans representative low-altitude operating levels and enables investigation of how height influences clearance, LOS probability, and multipath structure. Random UAV placements in dense urban environments were used to probe cluttered morphologies and irregular street canyons more effectively than a regular lattice.

While we aimed to maintain consistency with standard configurations, our primary objective is to support research on UAV communication with a particular focus on geometric factors. To this end, certain aspects, such as inter-UAV interference, were intentionally excluded to simplify analysis and emphasize spatial characteristics. The dataset is designed to capture a wide range of parameters, including TX–RX distances and altitude variations, in line with our goal of enabling detailed studies of geometry-aware UAV communication scenarios. Table 4 provides detailed configuration per scene, including layout types, node counts, and altitude ranges.

The dataset includes simulation of communication links in three configurations: ground-to-air (G2A), air-to-ground (A2G), and air-to-air (A2A). In this study, three carrier frequencies were considered: 2 GHz, 3.5 GHz, and 28 GHz. The 2 GHz band is representative of low-frequency operation, commonly employed for NB-IoT and wide-area coverage due to its favorable propagation and penetration characteristics. The 3.5 GHz band corresponds to sub-6 GHz systems widely used in current cellular networks, while the 28 GHz band represents mmWave frequencies anticipated for next-generation UAV-enabled communications.

All wireless nodes, both ground and aerial, were equipped with omnidirectional antennas with an assumed antenna gain of 0 dBi. In our simulation, we adopted a 0 dBi antenna model as a baseline reference to provide a neutral starting point for link budget evaluation. The transmission power was configured to 23 dBm for ground nodes and 27 dBm for UAV nodes, reflecting typical values used in UAV communication system simulations.
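To give a feel for how the 0 dBi baseline enters a link budget, the sketch below evaluates received power under free-space path loss for the three carrier frequencies. Free-space loss is used here only as a simple reference; it is not the ray-traced model used to generate the dataset, and the 100 m distance and function names are illustrative assumptions.

```python
import math

SPEED_OF_LIGHT_MPS = 299_792_458.0


def fspl_db(distance_m, freq_hz):
    """Free-space path loss in dB: 20 * log10(4 * pi * d * f / c)."""
    return 20.0 * math.log10(4.0 * math.pi * distance_m * freq_hz / SPEED_OF_LIGHT_MPS)


def rx_power_dbm(tx_dbm, distance_m, freq_hz, g_tx_dbi=0.0, g_rx_dbi=0.0):
    """Received power for a free-space link; gains default to the 0 dBi baseline."""
    return tx_dbm + g_tx_dbi + g_rx_dbi - fspl_db(distance_m, freq_hz)


# Ground node transmitting at 23 dBm over an assumed 100 m free-space link.
for freq_ghz in (2.0, 3.5, 28.0):
    p_rx = rx_power_dbm(23.0, 100.0, freq_ghz * 1e9)
    print(f"{freq_ghz:>4} GHz: {p_rx:.1f} dBm")
```

With both gains at 0 dBi, differences between frequencies in this reference case come from the path-loss term alone, which is the isolation the baseline is meant to provide.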

Propagation characteristics were modeled using the Wireless InSite X3D propagation engine (Remcom Inc., State College, PA, USA), which supports detailed ray-based analysis incorporating reflection and diffraction effects. Each simulation considered up to six reflections and one diffraction per ray, with a ray spacing of 0.25 degrees. Transmissions through objects were not permitted. Table 5 summarizes the electromagnetic parameters of each material, including dielectric properties, surface roughness, and thickness where applicable.

Environmental conditions during simulations across all scenarios were held constant to minimize variability caused by atmospheric effects. The selected values for temperature, pressure, and humidity were chosen based on moderate, typical climate conditions that are generally applicable across a variety of geographical regions. An overview of all propagation-related parameters is presented in Table 6.

In this study, we consider both static and mobile UAV scenarios across rural, suburban, and urban environments. In rural and suburban scenarios, a single UAV is used, and channel parameters are collected at each timestep across the flight to capture continuous changes in the communication environment. For these mobile scenarios, the UAV platform’s properties are detailed in Table 7. In urban areas, we divide the environment into four sub-areas (Bottom Left, Bottom Right, Top Left, Top Right), with one UAV assigned to each sub-area. Ground nodes are similarly divided into four subsections based on the sub-areas. The UAV movement follows a grid-based trajectory, and the velocities are set according to typical UAV operational speeds. The simulation timesteps are configured to reflect continuous movement, capturing how changes in position affect channel parameters.
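The four-way urban partition can be sketched as a simple quadrant lookup. The split at the scene midlines, the origin at the bottom-left corner, and the function name are assumptions, since the text names the sub-areas but does not give their exact boundaries.

```python
def sub_area(x_m, y_m, width_m, height_m):
    """Map a local (x, y) position to one of the four named sub-areas.

    Assumes the origin is the bottom-left corner and the split is at the midlines.
    """
    horizontal = "Left" if x_m < width_m / 2.0 else "Right"
    vertical = "Bottom" if y_m < height_m / 2.0 else "Top"
    return f"{vertical} {horizontal}"


def group_by_sub_area(nodes, width_m, height_m):
    """Group (x, y) ground-node positions by sub-area name."""
    groups = {}
    for x_m, y_m in nodes:
        groups.setdefault(sub_area(x_m, y_m, width_m, height_m), []).append((x_m, y_m))
    return groups


nodes = [(10.0, 10.0), (90.0, 10.0), (10.0, 90.0), (90.0, 90.0), (20.0, 30.0)]
groups = group_by_sub_area(nodes, width_m=100.0, height_m=100.0)
```

Each group can then be paired with the UAV assigned to the same sub-area.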

The mobile routes should be interpreted as controlled benchmark trajectories rather than mission-level flight plans for a specific UAV application. Their purpose is to traverse representative portions of each scene and induce systematic variations in transmitter–receiver distance, altitude clearance, and LOS/NLOS transitions. The use of constant-speed, grid-like motion was intentional, as it decouples geometry-dependent propagation effects from higher-level flight-control policies. Accordingly, the dataset is well-suited to trajectory-conditioned link analysis and geometry-aware learning, but it does not aim to reproduce the full flight mechanics, battery constraints, or obstacle-avoidance behavior of a particular logistics or inspection UAV platform.

For mobile scenarios, the UAV route is discretized into spatial increments of $\Delta s = 20$ m. Given the constant speed of $v = 20$ m/s and the absence of acceleration or deceleration, each increment corresponds to a temporal step of $\Delta t = \Delta s / v = 1$ s. Therefore, the k-th channel snapshot is aligned with the k-th discrete UAV pose and timestamp $t_{k} = k \Delta t$. All link-, path-, and interaction-level parameters stored for mobile cases refer to the instantaneous geometry at that sampled pose. This design provides a controlled sequence of geometry-conditioned channel snapshots along the route, rather than a continuous-time flight-dynamics model.
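The snapshot timing can be illustrated with a short sketch that enumerates sample indices, travelled distances, and timestamps along a route. The 100 m route length and the function name are illustrative assumptions; the 20 m step and 20 m/s speed are the values given in the text.

```python
def snapshot_schedule(route_length_m, step_m=20.0, speed_mps=20.0):
    """Return (k, distance_m, t_k) for each snapshot along a constant-speed route."""
    dt_s = step_m / speed_mps                   # time per spatial increment
    n_steps = int(route_length_m // step_m)     # whole increments that fit on the route
    return [(k, k * step_m, k * dt_s) for k in range(n_steps + 1)]


# Illustrative 100 m route segment.
for k, distance_m, t_s in snapshot_schedule(100.0):
    print(f"snapshot {k}: s = {distance_m:.0f} m, t = {t_s:.0f} s")
```

Because speed is constant, the snapshot index, the travelled distance, and the timestamp are interchangeable keys for aligning channel records with UAV poses.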

After preparing the channel parameters, the next step is to set up the environmental modality for UAV communication simulations. In this study, we focused on several key features commonly studied in UAV communication research, including building heights, building face and vertex counts, density ratios, distances between nodes, and terrain elevation. These features are not exhaustive, as ongoing research in geometry-aware communication continues to identify new parameters that significantly influence channel characteristics.

To support the extraction of these features, we utilized multiple file formats based on their ability to represent complex environmental geometries. Specifically, we employed STL and OBJ formats, which allow for detailed representation of 3D surface features and structural elements, including building facades, rooftops, and obstacles.
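As an example of mesh-level feature extraction, the sketch below counts vertices and faces in OBJ text and derives a vertical extent. It handles only plain `v` and `f` records and assumes the z axis points up; the axis convention and the tiny box model are assumptions for illustration, not properties confirmed for the released files.

```python
def obj_stats(obj_text):
    """Count vertices and faces in OBJ text and compute the z extent of the mesh."""
    vertices = []
    face_count = 0
    for line in obj_text.splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "v":        # geometric vertex: v x y z
            vertices.append(tuple(float(c) for c in parts[1:4]))
        elif parts[0] == "f":      # face record: f i j k ...
            face_count += 1
    zs = [v[2] for v in vertices]
    z_extent = max(zs) - min(zs) if zs else 0.0
    return {"vertices": len(vertices), "faces": face_count, "z_extent_m": z_extent}


# Illustrative 10 m x 10 m box, 12 m tall, with six quad faces.
BOX_OBJ = """
v 0 0 0
v 10 0 0
v 10 10 0
v 0 10 0
v 0 0 12
v 10 0 12
v 10 10 12
v 0 10 12
f 1 2 3 4
f 5 6 7 8
f 1 2 6 5
f 2 3 7 6
f 3 4 8 7
f 4 1 5 8
"""
stats = obj_stats(BOX_OBJ)
```

For a single-building mesh, the z extent serves as a simple height estimate; for a full scene, vertices would first need to be grouped per building.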

The advantage of the proposed multi-source construction is not limited to faster scenario preparation in Wireless InSite. Each format captures a distinct geometric abstraction useful for different classes of downstream analysis. OSM and GeoJSON provide map- and footprint-level context, DXF preserves CAD-like primitives and explicit geometric elements, and OBJ/STL expose surface-based 3D representations suitable for mesh-level feature extraction. By aligning these sources in a common ENU reference frame and linking them to the same communication instances, the dataset enables cross-modal feature engineering and reproducible benchmarking.
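To illustrate what placing geographic coordinates in a local frame involves, the sketch below converts latitude and longitude to east and north offsets with a flat-Earth (equirectangular) approximation. This is a simplified stand-in, adequate only over small scenes; the conversion actually used for the dataset is not specified here, and the spherical Earth radius, origin coordinates, and function name are assumptions.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, spherical approximation


def latlon_to_local_en(lat_deg, lon_deg, lat0_deg, lon0_deg):
    """Approximate east/north offsets (metres) of a point from a local origin."""
    d_lat = math.radians(lat_deg - lat0_deg)
    d_lon = math.radians(lon_deg - lon0_deg)
    east_m = d_lon * EARTH_RADIUS_M * math.cos(math.radians(lat0_deg))
    north_m = d_lat * EARTH_RADIUS_M
    return east_m, north_m


# Illustrative origin and a point 0.001 degrees further north and east.
origin = (40.7500, -73.9900)
east_m, north_m = latlon_to_local_en(40.7510, -73.9890, *origin)
```

Applying the same origin to node coordinates and to geometry derived from map sources is what keeps the modalities in a single frame.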

In principle, a pre-built Wireless InSite scenario could be used as a starting point for multimodal dataset construction. However, a simulation project alone does not necessarily expose an open, modality-explicit, ML-ready organization of geometry files, node metadata, link-level statistics, path descriptors, and ray-interaction sequences. The contribution of this work lies in explicitly packaging and aligning these complementary modalities for benchmark-oriented use.

In contrast to basic geometric representations (e.g., cubes or rectangles), we also incorporated DXF (Drawing Exchange Format) files, which are 3D CAD models offering a high level of detail. DXF files capture complex building geometries by preserving geometric elements such as LINE, POLYLINE, and 3DFACE, with explicit XYZ coordinates for their vertices.

Additionally, to enhance the environmental modeling, we utilized OpenStreetMap (OSM) data for building footprints. Given that OSM data typically lacks building height information, we supplemented it with GeoJSON files derived from Google Earth and gee-community-catalog.org (https://gee-community-catalog.org/, accessed on 6 March 2026), which provide accurate geometry-aware terrain elevation and obstacle data. The catalog offers standardized geospatial datasets, which were used to extract terrain elevation and other environmental features necessary for comprehensive multi-modal channel modeling.

Scope and Limitations of the Current Release

The proposed dataset is entirely simulation-based and should be viewed as a controlled benchmark resource rather than as a replacement for dedicated field measurements. Its realism depends on the fidelity of the ray-tracing engine, the accuracy of the underlying geographic and geometric sources, the assigned material properties, and the assumptions inherited from the adopted standards and propagation settings. For this reason, the dataset is particularly suitable for comparative and geometry-conditioned studies, while direct transfer to deployment-specific real-world performance should be interpreted with appropriate caution.

Several simplifying assumptions were introduced deliberately. First, all nodes employ omnidirectional antennas with 0 dBi gain so that the current release isolates geometry- and propagation-induced effects without additional gains from beam steering, polarization, or antenna tilting. Second, inter-UAV interference, scheduling effects, and broader network-level dynamics were excluded in order to focus the benchmark on link-level and geometry-aware channel behavior. As a result, the dataset is not intended for dense multi-UAV interference studies, interference-aware communication strategies, or NTN/system-level evaluations in its current form. Third, atmospheric parameters were fixed across scenarios to reduce confounding variability during dataset generation. Future extensions may incorporate directional radiation patterns, beamforming configurations, polarization effects, network-level interactions, and measurement-aligned validation.

4. Dataset Description

The dataset for each simulation area consists of five structured CSV files, each capturing a distinct layer of information for UAV communication modeling, together with geometry-rich files describing the environment’s spatial characteristics. The CSV files cover node placement data for aerial and ground nodes, link-level statistics, detailed path-level features, and ray-traced interaction sequences.

The geometrical part of the Environment Modal provides crucial information about the physical structure and layout of the environment in which communication takes place. These features are represented by 3D geometries in formats such as GeoJSON, STL, OBJ, and DXF, which describe the terrain, buildings, and other obstacles affecting the propagation paths.

In the communication modal:

The first two files contain spatial metadata for aerial and ground nodes. For UAVs and ground users, each entry includes a unique identifier, geographic coordinates (latitude, longitude), and corresponding local Cartesian coordinates (x, y, z).
The third file provides link-level propagation characteristics between each TX–RX pair. The recorded parameters include the total number of resolved paths, the received signal power in dBm, the mean time of arrival (ToA), the delay spread, and the status (LOS, NLOS, and Blocked), all of which are key descriptors in channel modeling.
The fourth file offers fine-grained path-level data for each link, listing each resolved multipath component separately. Features include the number of interactions (excluding transmitter and receiver), received power, signal phase, exact ToA, and angular properties, including arrival and departure angles in both the azimuth and elevation planes. Additionally, each path is tagged with an encoded interaction pattern that describes the physical phenomena encountered.
Finally, the fifth file lists 3D ray interaction points for each propagation path. Each path is expressed as a sequence of discrete interaction events, such as reflections, diffractions, or foliage penetration, each represented by a unique integer code. These codes follow the scheme shown in Table 8. For instance, an interaction pattern of 1429 denotes a path that begins at the transmitter (1), proceeds through diffraction (4) and reflection (2), and ends at the receiver (9). Code 3 is retained in the generic interaction-encoding schema for future extensibility and compatibility. It does not appear in the current release because object transmission was disabled in the ray-tracing configuration.
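The encoded interaction patterns can be unpacked digit by digit, as in the following sketch. The mapping covers only the codes that can be inferred from the 1429 example and the note on code 3; any other codes defined in Table 8 (for example, for foliage) are not reproduced here, and the function names are illustrative.

```python
# Codes inferred from the worked example in the text; Table 8 is the authoritative list.
INTERACTION_CODES = {
    "1": "transmitter",
    "2": "reflection",
    "3": "transmission",  # reserved in the schema, absent from the current release
    "4": "diffraction",
    "9": "receiver",
}


def decode_pattern(pattern):
    """Turn an encoded pattern such as 1429 into a list of event names."""
    return [INTERACTION_CODES.get(digit, f"unknown({digit})") for digit in str(pattern)]


def interaction_count(pattern):
    """Number of interactions, excluding the transmitter and receiver endpoints."""
    return sum(1 for digit in str(pattern) if digit not in ("1", "9"))


events = decode_pattern(1429)
print(" -> ".join(events))
```

A direct LOS path would then reduce to the two endpoint codes, giving an interaction count of zero.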

To make the multimodal structure of the dataset explicit, Table 9 summarizes the relationship between the released file formats, the information they represent, and their typical downstream use in geometry-aware UAV wireless communication research.

As shown in Table 9, the dataset is multimodal not only because it includes multiple file formats, but because it aligns complementary environment and communication representations within a common benchmark structure. This organization enables studies that connect geometric abstractions of the scene with channel behavior at the node, link, path, and interaction levels.

This explicit alignment distinguishes the proposed dataset from simulation project files that may contain a scenario definition but do not necessarily expose an open, modality-aware, ML-ready organization of geometry and propagation descriptors.

To provide a multi-level overview of the dataset, Table 10 summarizes the number of samples collected for each simulation scenario. The statistics are divided by environment, mobility model, and communication frequency. The number of samples reflects the variation in simulation conditions, including geographic locations and node configurations.
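The link counts in Table 10 can be cross-checked against the node counts. The sketch below assumes that the `L:` entries are link counts, that the TX-RX column reads "UAV nodes + ground nodes", and that every TX–RX combination is simulated (including same-index pairs for U2U). This rule is inferred from the tabulated values rather than stated by the authors, and it does not cover the Seoul mobile scenario, which is split into four sub-routes.

```python
def expected_static_links(n_uav, n_gn):
    """Link counts per direction if every TX-RX combination is simulated."""
    return {
        "G2U": n_gn * n_uav,   # each ground node transmits to each UAV
        "U2G": n_uav * n_gn,   # each UAV transmits to each ground node
        "U2U": n_uav * n_uav,  # each UAV transmits to each UAV
    }


def expected_mobile_links(timesteps, n_gn):
    """Link count for one moving UAV sampled at `timesteps` route positions."""
    return timesteps * n_gn


# Tennessee, static: 80 UAVs + 625 ground nodes (Table 10).
print(expected_static_links(80, 625))
# Tennessee, mobile: 1068 timesteps (Table 7) and 625 ground nodes.
print(expected_mobile_links(1068, 625))
```

Under these assumptions the sketch reproduces the tabulated 50,000 / 50,000 / 6,400 static links and 667,500 mobile links for Tennessee.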

The dataset is structured into three main regions, Rural, Suburban, and Urban, and provides both Communication Modal and Environment Modal data. The Communication Modal includes simulations of UAV communication channels, divided into Static (fixed positions) and Mobile (moving nodes) categories. Each category contains different communication scenarios based on frequency bands. The Environment Modal contains 3D environmental data for each scenario.

To complement the dataset’s structural description and file-level organization, we also report a small set of aggregate descriptive summaries derived from the sample statistics of the released benchmark instances (Figure 3, Figure 4 and Figure 5). These summaries are not intended as a full performance evaluation; rather, they provide a compact view of how propagation richness and interaction complexity vary across environments, frequency bands, and link directions. In this way, they help make the benchmark’s internal structure more explicit and visually support the interpretation of the node-, link-, path-, and interaction-level data released with the dataset.

As shown in Figure 3, Figure 4 and Figure 5, the released benchmark instances exhibit meaningful variability across environments, frequency bands, link directions, and mobility settings. This descriptive heterogeneity is important because it indicates that the dataset is not a homogeneous collection of channel samples, but a structured benchmark spanning distinct propagation conditions. The observed trends also motivate releasing descriptors at multiple levels, since node- and link-level summaries alone do not fully capture the path richness and interaction complexity in the benchmark. In this sense, these plots help clarify the intended use of the dataset for comparative benchmarking and geometry-aware learning tasks.

Cross-Source Geometry Alignment and Validation

To ensure the reliability of the multi-source dataset, we performed a quantitative validation of building geometries obtained from different platforms. Specifically, building footprints from a DXF model (exported from Blender) were compared with GeoJSON polygons sourced from gee-community-catalog.org covering the same urban region. Both datasets were transformed into a local East-North-Up (ENU) coordinate frame centered at the Wireless InSite simulation origin.
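A transformation of this kind can be sketched with the standard WGS84 geodetic-to-ECEF-to-ENU conversion. This is a minimal illustration, not the authors' pipeline: the origin used in the example is the south-west corner of the Pisa-West area from Table 2, chosen only as a placeholder, since the actual Wireless InSite origin is not given here.

```python
import math

# WGS84 ellipsoid constants.
WGS84_A = 6378137.0                      # semi-major axis (m)
WGS84_F = 1.0 / 298.257223563            # flattening
WGS84_E2 = WGS84_F * (2.0 - WGS84_F)     # first eccentricity squared


def geodetic_to_ecef(lat_deg, lon_deg, h):
    """Convert latitude/longitude (degrees) and ellipsoidal height (m) to ECEF."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    n = WGS84_A / math.sqrt(1.0 - WGS84_E2 * math.sin(lat) ** 2)
    x = (n + h) * math.cos(lat) * math.cos(lon)
    y = (n + h) * math.cos(lat) * math.sin(lon)
    z = (n * (1.0 - WGS84_E2) + h) * math.sin(lat)
    return x, y, z


def geodetic_to_enu(lat_deg, lon_deg, h, lat0_deg, lon0_deg, h0):
    """Express a geodetic point in the East-North-Up frame centered at the origin."""
    x, y, z = geodetic_to_ecef(lat_deg, lon_deg, h)
    x0, y0, z0 = geodetic_to_ecef(lat0_deg, lon0_deg, h0)
    dx, dy, dz = x - x0, y - y0, z - z0
    lat0, lon0 = math.radians(lat0_deg), math.radians(lon0_deg)
    east = -math.sin(lon0) * dx + math.cos(lon0) * dy
    north = (-math.sin(lat0) * math.cos(lon0) * dx
             - math.sin(lat0) * math.sin(lon0) * dy
             + math.cos(lat0) * dz)
    up = (math.cos(lat0) * math.cos(lon0) * dx
          + math.cos(lat0) * math.sin(lon0) * dy
          + math.sin(lat0) * dz)
    return east, north, up


# Placeholder origin (assumption): south-west corner of Pisa-West in Table 2.
ORIGIN = (43.7061, 10.3834, 0.0)
print(geodetic_to_enu(43.7071, 10.3834, 0.0, *ORIGIN))
```

Applying the same origin to both the DXF-derived and the GeoJSON footprints puts them in a common metric frame, so that areas and displacements can be compared in meters.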

A nearest-matching procedure identified the best correspondence for each building. To evaluate cross-source geometric consistency, we computed three similarity metrics for each building: Intersection-over-Union (IoU) of building footprints, centroid displacement, and footprint area ratio.
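The three metrics can be illustrated with a small pure-Python sketch. It uses the shoelace formula for area and centroid and Sutherland–Hodgman clipping for the intersection, which assumes that the clipping polygon is convex and listed counter-clockwise; real footprints are often non-convex and would need a general polygon-clipping routine. The function names and the two example squares are hypothetical.

```python
import math


def signed_area(poly):
    """Shoelace formula; positive for counter-clockwise vertex order."""
    return 0.5 * sum(x1 * y2 - x2 * y1
                     for (x1, y1), (x2, y2) in zip(poly, poly[1:] + poly[:1]))


def centroid(poly):
    """Area centroid of a simple polygon."""
    a = signed_area(poly)
    cx = cy = 0.0
    for (x1, y1), (x2, y2) in zip(poly, poly[1:] + poly[:1]):
        cross = x1 * y2 - x2 * y1
        cx += (x1 + x2) * cross
        cy += (y1 + y2) * cross
    return cx / (6.0 * a), cy / (6.0 * a)


def _side(a, b, p):
    """Positive if p lies to the left of the directed edge a -> b."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def clip_convex(subject, clip):
    """Sutherland-Hodgman clipping against a convex, counter-clockwise polygon."""
    output = list(subject)
    for a, b in zip(clip, clip[1:] + clip[:1]):
        if not output:
            break
        previous, output = output, []
        for p, q in zip(previous, previous[1:] + previous[:1]):
            sp, sq = _side(a, b, p), _side(a, b, q)
            if (sp < 0) != (sq < 0):
                # The segment p -> q crosses the clip edge: add the crossing point.
                t = sp / (sp - sq)
                output.append((p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])))
            if sq >= 0:
                output.append(q)
    return output


def footprint_metrics(dxf_poly, geojson_poly):
    """Return (IoU, centroid displacement in m, area ratio) for two footprints."""
    area_a, area_b = abs(signed_area(dxf_poly)), abs(signed_area(geojson_poly))
    overlap = clip_convex(dxf_poly, geojson_poly)
    inter = abs(signed_area(overlap)) if len(overlap) >= 3 else 0.0
    iou = inter / (area_a + area_b - inter)
    displacement = math.dist(centroid(dxf_poly), centroid(geojson_poly))
    return iou, displacement, area_a / area_b


# Two 10 m x 10 m squares offset by 1 m along the east axis (ENU coordinates).
dxf = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
geojson = [(1.0, 0.0), (11.0, 0.0), (11.0, 10.0), (1.0, 10.0)]
print(footprint_metrics(dxf, geojson))
```

For the two example squares the overlap is 90 m² and the union 110 m², so a 1 m shift of a 10 m footprint already lowers the IoU to about 0.82 while leaving the area ratio at 1.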

Table 11 highlights that the vast majority of buildings (97%) exhibit strong agreement between DXF and GeoJSON footprints, indicating that the dataset metadata provides reliable correspondence across sources. The few Shifted Matches (1.3%) reflect minor spatial offsets, typically less than 10 meters, which could be attributed to differences in coordinate precision, building simplification in the DXF model, or slight variations in polygon delineation. Only 9 buildings (1.6%) lacked a match, which may be due to missing structures or inconsistencies between the data sources.

The mean IoU of 0.86 indicates that the overlapping area between corresponding polygons is highly consistent, while the near-unity mean footprint area ratio (0.997) confirms that building sizes are preserved across sources. The small mean centroid displacement of 0.74 m further validates that positional accuracy is sufficient for UAV channel modeling applications, including LOS determination, diffraction edge identification, and ML-based propagation prediction.

This cross-source comparison highlights a common challenge in geometry-aware UAV communication research: no single data source provides all building features with perfect accuracy, and discrepancies across platforms are unavoidable due to differences in coordinate systems, polygon simplification, or modeling conventions. While absolute alignment across sources is not guaranteed, we included multi-modal metadata linking buildings across formats, enabling researchers to confidently fuse multi-source geometries for feature extraction. Despite these limitations, the dataset achieves sub-meter alignment accuracy, preserves building footprint shapes and areas, and provides sufficient reliability for downstream UAV channel modeling tasks.

To provide a visual perspective on cross-source consistency, Figure 6 presents two complementary views side by side. The left panel overlays DXF (red) and GeoJSON (blue) building footprints in the ENU coordinate frame, providing a qualitative overview of spatial alignment. Areas where colors overlap indicate strong agreement, while non-overlapping regions highlight minor shifts or missing structures. The right panel colors each GeoJSON building according to its IoU with the best-matching DXF polygon, offering a quantitative measure of per-building agreement. Together, these visualizations confirm that most buildings align closely, with only a few exhibiting small spatial offsets, consistent with the statistics in Table 11. This validation was conducted on a representative urban subset to quantify cross-source geometric consistency and should be interpreted as an indicator of alignment quality rather than as an exhaustive validation of every scene in the dataset.

Folder Hierarchy:

(a) Communication Modal
    i. Static
        A. SUB-6
            Scenarios
                1. U2G / G2U / U2U
                2. TX-RX metadata
        B. NB-IoT
            Scenarios
                1. U2G / G2U / U2U
                2. TX-RX metadata
        C. mmWave
            Scenarios
                1. U2G / G2U / U2U
                2. TX-RX metadata
    ii. Mobile
        A. Scenarios
            1. U2G
            2. TX-RX metadata
(b) Environment Modal
    i. Scenarios
        Geometry-rich files

Each scenario folder in the Communication Modal contains the channel parameters for that scenario, while the Environment Modal provides the geometry files describing the environment’s physical layout. The dataset is available on Zenodo (https://zenodo.org/) under the DOI 10.5281/zenodo.17486134.

5. Benchmark Tasks Enabled by the Dataset

The structured organization of the dataset enables several benchmark tasks for geometry-aware UAV wireless communication research without requiring proprietary simulation access or extensive preprocessing. The tasks below are intended as representative examples of how the released modalities can be used in reproducible evaluation settings.

(1) Path-loss/received-power regression. Using node metadata, transmitter–receiver distance, altitude information, environment descriptors, and link-level statistics, the dataset supports supervised regression of received power or path loss across heterogeneous environments and frequency bands.
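As a simple illustration of this task, a log-distance model $PL(d) = PL_0 + 10\,n\,\log_{10}(d/d_0)$ can be fitted by ordinary least squares. The sketch below is a generic baseline, not a method evaluated in the paper; the sample values are synthetic, and in practice the distances would come from the node coordinates and the path-loss values from the link-level file.

```python
import math


def link_distance(tx_xyz, rx_xyz):
    """3D Euclidean distance between two nodes in local Cartesian coordinates."""
    return math.dist(tx_xyz, rx_xyz)


def fit_log_distance(distances_m, path_loss_db, d0=1.0):
    """Least-squares fit of PL(d) = PL0 + 10 * n * log10(d / d0).

    Returns (PL0 in dB, path-loss exponent n).
    """
    xs = [10.0 * math.log10(d / d0) for d in distances_m]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(path_loss_db) / len(path_loss_db)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, path_loss_db))
    exponent = sxy / sxx
    return mean_y - exponent * mean_x, exponent


# Synthetic samples generated with PL0 = 40 dB and n = 2.5 (not dataset values).
distances = [10.0, 50.0, 100.0, 500.0, 1000.0]
losses = [40.0 + 25.0 * math.log10(d) for d in distances]
print(fit_log_distance(distances, losses))
```

Fitting the model separately per environment, frequency band, or link state gives a reference against which learned, geometry-aware predictors can be compared.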

(2) LOS/NLOS/Blocked classification. The combination of geometry-rich environment files, link-level labels, and path/interaction data makes the dataset suitable for predicting link-state categories such as LOS, NLOS, and Blocked using spatial and propagation-aware features.
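A purely geometric line-of-sight test is a natural baseline feature for this task. The sketch below approximates each building by an axis-aligned box and checks whether the TX–RX segment intersects it (slab method). The box approximation is an assumption made for brevity, since the released geometry consists of meshes and footprints, and the resulting label need not coincide with the ray-traced link status in the dataset.

```python
def segment_hits_box(p0, p1, box_min, box_max):
    """Slab test: does the segment p0 -> p1 intersect the axis-aligned box?"""
    t_enter, t_exit = 0.0, 1.0
    for a, b, lo, hi in zip(p0, p1, box_min, box_max):
        d = b - a
        if d == 0:
            # Segment is parallel to this pair of faces.
            if a < lo or a > hi:
                return False
            continue
        t1, t2 = (lo - a) / d, (hi - a) / d
        if t1 > t2:
            t1, t2 = t2, t1
        t_enter = max(t_enter, t1)
        t_exit = min(t_exit, t2)
        if t_enter > t_exit:
            return False
    return True


def is_line_of_sight(tx, rx, boxes):
    """True if no box obstructs the straight segment between tx and rx."""
    return not any(segment_hits_box(tx, rx, lo, hi) for lo, hi in boxes)


# One hypothetical 30 m tall building between a UAV and a ground node at 1.5 m.
building = ((40.0, -10.0, 0.0), (60.0, 10.0, 30.0))
ground_node = (100.0, 0.0, 1.5)
print(is_line_of_sight((0.0, 0.0, 60.0), ground_node, [building]))   # UAV at 60 m
print(is_line_of_sight((0.0, 0.0, 120.0), ground_node, [building]))  # UAV at 120 m
```

In this example the link is obstructed with the UAV at 60 m and clear at 120 m, which illustrates how altitude interacts with building height in the link-state labels.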

(3) Delay-related regression. The path-level and link-level files include delay-related quantities, such as mean time of arrival and delay spread, enabling benchmark studies of delay prediction across different environments, link distances, and carrier frequencies.
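Link-level delay targets can be related to the path-level data through the usual power-weighted definitions, $\bar{\tau} = \sum_k P_k \tau_k / \sum_k P_k$ and $\sigma_\tau = \sqrt{\sum_k P_k (\tau_k - \bar{\tau})^2 / \sum_k P_k}$. The sketch below applies these formulas to per-path powers in dBm and ToA values in seconds. Whether the released mean ToA and delay spread follow exactly this definition is an assumption to be checked against the data.

```python
import math


def delay_statistics(powers_dbm, toas_s):
    """Power-weighted mean delay and RMS delay spread for one link.

    powers_dbm: received power of each path in dBm.
    toas_s: time of arrival of each path in seconds.
    """
    weights = [10.0 ** (p / 10.0) for p in powers_dbm]  # dBm -> mW
    total = sum(weights)
    mean_toa = sum(w * t for w, t in zip(weights, toas_s)) / total
    variance = sum(w * (t - mean_toa) ** 2 for w, t in zip(weights, toas_s)) / total
    return mean_toa, math.sqrt(variance)


# Two hypothetical equal-power paths arriving at 1 us and 3 us.
print(delay_statistics([-80.0, -80.0], [1e-6, 3e-6]))
```

For the two equal-power paths the mean delay is 2 µs and the RMS delay spread is 1 µs; weaker late paths would pull both values toward the dominant component.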

(4) Trajectory-conditioned link-quality analysis. The mobile scenarios support sequential prediction tasks in which channel quality indicators are modeled along sampled UAV routes. This includes forecasting received power or link state along a trajectory from past observations and geometry-aware inputs.
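For such forecasting tasks, the per-timestep values of a single UAV-to-ground link are typically arranged into (history, target) pairs. The sketch below shows one way to do this with a sliding window; the function name and the sample series are hypothetical. With the settings in Table 7 (20 m/s and 20 m per increment), consecutive samples are 1 s apart.

```python
def sliding_windows(series, history, horizon=1):
    """Build (past values, future target) pairs from one time-ordered series.

    history: number of past samples used as input.
    horizon: how many steps ahead the target lies (1 = next sample).
    """
    samples = []
    for start in range(len(series) - history - horizon + 1):
        past = series[start:start + history]
        target = series[start + history + horizon - 1]
        samples.append((past, target))
    return samples


# Hypothetical received power (dBm) of one ground node along a UAV route.
power_dbm = [-70.0, -72.0, -75.0, -71.0, -69.0]
for past, target in sliding_windows(power_dbm, history=3):
    print(past, "->", target)
```

Windows should be built per route and per ground node, so that no training pair mixes samples from different trajectories.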

These tasks are not intended to exhaust the possible uses of the dataset. Rather, they illustrate the benchmark-oriented design of the current release and the types of reproducible studies that it is meant to support.

6. Conclusions

In this work, we introduced an open-access, geometry-aware benchmark dataset for UAV wireless communication research. The dataset jointly exposes aligned environment and communication modalities across rural, suburban, and urban regions, spanning both static and mobile settings across multiple carrier frequencies. Its main value lies in providing a controlled and reproducible benchmark structure in which geometry, node metadata, link-level statistics, path-level descriptors, and ray-interaction sequences can be studied together.

The current release is intended primarily for link-level and geometry-aware benchmarking. As discussed in the manuscript, it is entirely simulation-based and adopts several simplifying assumptions, including omnidirectional antennas, fixed atmospheric settings, and the exclusion of interference and broader network-level dynamics. These choices were made deliberately to isolate geometry-conditioned propagation behavior and to provide a transparent baseline for comparative studies.

Future extensions of the dataset may include directional antenna patterns, beamforming and polarization settings, network-level interactions, additional environments, and stronger alignment with measurement-based data. We expect the present release to serve as a practical benchmark resource for reproducible studies on path-loss prediction, LOS/NLOS inference, delay-related modeling, and mobility-aware UAV communication analysis.

Author Contributions

Conceptualization, N.A., A.C. and E.M.; Methodology, N.A., A.C. and E.M.; Software, N.A.; Validation, A.C.; Formal Analysis, N.A.; Investigation, N.A.; Resources, A.C. and E.M.; Data Curation, N.A. and A.C.; Writing (Original Draft Preparation), N.A., A.C. and E.M.; Writing (Review and Editing), N.A., A.C. and E.M.; Visualization, N.A.; Supervision, E.M.; Project Administration, E.M.; Funding Acquisition, E.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The dataset is available at Zenodo under the DOI 10.5281/zenodo.17486134.

Acknowledgments

The authors thank the Institute of Information Science and Technologies (ISTI) of the National Research Council (CNR), Italy, for supporting the data collection and simulation. The authors also thank Tarun Chawla and Remcom Inc. for providing access to the Wireless InSite X3D ray tracer, which supported the generation of the dataset presented in this work.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

UAV	Unmanned Aerial Vehicle
G2A	Ground-to-air
A2G	Air-to-ground
A2A	Air-to-air
ML	Machine learning
IQ	In-phase and quadrature
GPS	Global Positioning System
RSS	Received signal strength
LOS	Line-of-sight
CIR	Channel impulse response
CSI	Channel state information
MMWave	Millimeter-wave
SNR	Signal-to-noise ratio
3GPP	3rd Generation Partnership Project
DXF	Drawing Exchange Format
OSM	OpenStreetMap
ToA	Time of arrival

References

1. Asadzadeh, S.; de Oliveira, W.J.; de Souza Filho, C.R. UAV-based remote sensing for the petroleum industry and environmental monitoring: State-of-the-art and perspectives. J. Pet. Sci. Eng. 2022, 208, 109633.
2. Grzybowski, J.; Latos, K.; Czyba, R. Low-cost autonomous UAV-based solutions to package delivery logistics. In Advanced, Contemporary Control, Proceedings of the KKA 2020—The 20th Polish Control Conference, Łódź, Poland, 7–9 October 2020; Springer: Berlin/Heidelberg, Germany, 2020; pp. 500–507.
3. Xiao, N.; Wen, W.; Hu, J.; Yang, P.; Zhao, J.; Wu, C.; Bai, S. SUG-UAV Multirotor Dataset with Multi-sensor Integration in Indoor and Urban Areas. In Proceedings of the 2024 14th International Conference on Indoor Positioning and Indoor Navigation (IPIN), Kowloon, Hong Kong, 25–27 September 2024; pp. 1–5.
4. Raouf, A.H.F.; Lee, D.; Rahman, M.; Masrur, S.; Reddy, G.; Dickerson, C.; Hossen, M.S.; Villar, S.V.; Gürses, A.; Singh, S.; et al. Wireless Datasets for Aerial Networks. arXiv 2025, arXiv:2510.08752.
5. Khuwaja, A.A.; Chen, Y.; Zhao, N.; Alouini, M.S.; Dobbins, P. A Survey of Channel Modeling for UAV Communications. IEEE Commun. Surv. Tutor. 2018, 20, 2804–2821.
6. Zhang, Y.; Doshi, A.; Liston, R.; Tan, W.; Zhu, X.; Andrews, J.; Heath, R. DeepWiPHY: Synthetic and Real-World IEEE 802.11ax OFDM Symbol Dataset. IEEE Dataport. 2020. Available online: https://ieee-dataport.org/open-access/deepwiphy-synthetic-and-real-world-ieee-80211ax-ofdm-symbol-dataset (accessed on 22 March 2026).
7. Hussain, S.; Bacha, S.F.; Cheema, A.A.; Canberk, B.; Duong, T.Q. Geometrical Features based mmWave UAV Path Loss Prediction using Machine Learning for 5G and Beyond. IEEE Open J. Commun. Soc. 2024, 5, 5667–5679.
8. Roy, S.; Majumdar, S.; Swetnam, T. Samapriya/Awesome-Gee-Community-Datasets: Community Catalog (3.9.0). 2025. Available online: https://zenodo.org/records/17641528 (accessed on 22 March 2026).
9. Gill, J.S.; Velashani, M.S.; Wolf, J.; Kenney, J.; Manesh, M.R.; Kaabouch, N. Simulation Testbeds and Frameworks for UAV Performance Evaluation. In Proceedings of the 2021 IEEE International Conference on Electro Information Technology (EIT), Mt. Pleasant, MI, USA, 14–15 May 2021; pp. 335–341.
10. Mozny, R.; Masek, P.; Stusek, M.; Molnar, K.; Palenska, M.; Moltchanov, D.; Hosek, J. Experimental Quality Assessment of Cellular Networks and Their Utilization for UAV Services. In Proceedings of the IEEE Vehicular Technology Conference (VTC), Florence, Italy, 20–23 June 2023; pp. 1–6.
11. Braunfelds, J.; Jakovels, G.; Murans, I.; Litvinenko, A.; Senkans, U.; Rumba, R.; Onzuls, A.; Valters, G.; Lidere, E.; Plone, E. Experimental Study on LTE Mobile Network Performance Parameters for Controlled Drone Flights. Sensors 2024, 24, 6615.
12. Ruseno, N.; Ongkowijoyo, H.V.; Lin, C.Y. Analysis of 4G Signal Quality in the UAS Network Remote ID Using Machine Learning Methods. J. Aeronaut. Astronaut. Aviat. 2024, 56, 555–568.
13. Wang, S.; Li, S.; Zhang, Y.; Yu, S.; Yuan, S.; She, R.; Guo, Q.; Zheng, J.; Howe, O.K.; Chandra, L.; et al. UAVScenes: A Multi-Modal Dataset for UAVs. arXiv 2025, arXiv:2507.22412.
14. Dickerson, C.; Raouf, A.H.F.; Ozdemir, O.; Guvenc, I.; Sichitiu, M. AERPAW UAV-Based Signal Data Collected at Varying Altitudes and Sampling Rates for Wireless Communication Studies. 2025. Available online: https://datadryad.org/dataset/doi:10.5061/dryad.2z34tmpvv?public=true (accessed on 22 March 2026).
15. Gürses, A.; Sichitiu, M.L. Air-to-Ground Channel Modeling for UAVs in Rural Areas. In Proceedings of the 2024 IEEE 100th Vehicular Technology Conference (VTC2024-Fall), Washington, DC, USA, 7–10 October 2024; pp. 1–6.
16. Colpaert, A.; Thys, C.; Cui, Z.; Pollin, S. MaMIMO-UAV 3D Channel State Information Dataset. 2023. Available online: https://rdr.kuleuven.be/dataset.xhtml?persistentId=doi:10.48804/0IMQDF (accessed on 22 March 2026).
17. Polese, M.; Bertizzolo, L.; Bonati, L.; Gosain, A.; Melodia, T. An Experimental mmWave Channel Model for UAV-to-UAV Communications. In Proceedings of the ACM Workshop on Millimeter-Wave Networks and Sensing Systems (mmNets), London, UK, 25 September 2020.
18. Catak, F.O.; Kuzlu, M.; Catak, E.; Cali, U.; Guler, O. Defensive Distillation-Based Adversarial Attack Mitigation Method for Channel Estimation Using Deep Learning Models in Next-Generation Wireless Networks. IEEE Access 2022, 10, 98191–98203.
19. Suo, J.; Wang, T.; Zhang, X.; Chen, H.; Zhou, W.; Shi, W. HIT-UAV: A High-Altitude Infrared Thermal Dataset for Unmanned Aerial Vehicle-Based Object Detection. Sci. Data 2023, 10, 227.
20. Colpaert, A. 3D Massive MIMO Air-to-Ground UAV CSI Dataset in Campus Environment. 2025. Available online: https://rdr.kuleuven.be/dataset.xhtml?persistentId=doi:10.48804/MTNAEG (accessed on 22 March 2026).
21. 3GPP. Study on Enhanced LTE Support for Aerial Vehicles; Technical Report TR 36.777 V15.0.0; Release 15; 3rd Generation Partnership Project (3GPP): Sophia Antipolis, France, 2017.
22. 3GPP. Study on Channel Model for Frequencies from 0.5 to 100 GHz; Technical Report TR 38.901 V16.1.0; Release 16; 3rd Generation Partnership Project (3GPP): Sophia Antipolis, France, 2020.
23. ITU-R. Propagation Data and Prediction Methods for the Planning of Short-Range Outdoor Radiocommunication Systems and Radio Local Area Networks in the Frequency Range 300 MHz to 100 GHz; Technical Report P.1410-5; International Telecommunication Union: Geneva, Switzerland, 2012.

Figure 1. Geometrical layout of several UAV communication scenarios. (a) Rural (Tennessee, USA). (b) Suburban (Palo Alto, California). (c) Urban (Pisa-West, Italy). (d) Urban (Seoul, South Korea).

Figure 2. 3D schematic of the UAV communication sampling layout used in the dataset generation.

Figure 3. Average number of resolved paths per link in static scenarios.

Figure 4. Average interaction complexity per path in static scenarios.

Figure 5. Average number of resolved paths per link in the mobile benchmark scenarios. The plot summarizes the path richness observed along sampled UAV routes across rural, suburban, and urban settings.

Figure 6. (Left): Overlay of DXF (red) and GeoJSON (purple) building footprints in the ENU coordinate frame. (Right): GeoJSON buildings colored by Intersection-over-Union (IoU) with the best-matching DXF polygons.

Table 1. Comparison of Publicly Available UAV Wireless Communication Datasets.

Dataset	Real/ Synthetic	3D Geometry	Mobility	Multi Frequency	ML-Ready Structure	Public Access
[14]	Real	Partial (GPS, altitude)	Yes	No	Medium (raw IQ)	Yes
[15]	Real	Partial (position + CIR)	Yes	No	Medium	Limited
[16]	Real	Yes (3D trajectory)	Yes	No	High (structured CSI)	Yes
[17]	Real	Limited	Limited	No	Medium	Limited
[20]	Real	Yes (aerial corridor geometry)	Yes	No	High	Yes
[18]	Synthetic	No explicit geometry	No	Yes	High (labeled)	Limited
Proposed	Synthetic	Yes	Yes	Yes	High	Yes

Table 2. Simulation area boundaries and environment types.

ID	Location	Latitude	Longitude	Approx. Size	Environment Type
1	Manhattan, Kansas	$39.18154 : 39.18619$	$- 96.57281 : - 96.56654$	$\sim 516 m \times 538 m$	Urban
2	Seoul, South Korea	$37.57110 : 37.57723$	$127.04500 : 127.05337$	$\sim 680 m \times 736 m$	Urban
3	Boston, Massachusetts	$42.33801 : 42.34657$	$- 71.07675 : - 71.06496$	$\sim 950 m \times 970 m$	Urban
4	London, UK	$51.46947 : 51.47810$	$- 0.21610 : - 0.19406$	$\sim 958 m \times 1520 m$	Urban
5	Pisa-West, Italy	$43.7061 : 43.7277$	$10.3834 : 10.4170$	$\sim 2398 m \times 2698 m$	Urban
6	Pisa-East, Italy	$43.7061 : 43.7277$	$10.4170 : 10.4357$	$\sim 2398 m \times 1500 m$	Urban
7	Palo Alto, California	$37.37839 : 37.38463$	$- 122.15039 : - 122.14205$	$\sim 693 m \times 734 m$	Suburban
8	Lewistown, Montana	$47.15090 : 47.15567$	$- 110.22465 : - 110.21674$	$\sim 530 m \times 596 m$	Suburban
9	Black Hills, South Dakota	$43.76795 : 43.77679$	$- 103.58235 : - 103.56946$	$\sim 981 m \times 1031 m$	Suburban
10	Tennessee, USA	$35.07622 : 35.08561$	$- 85.08117 : - 85.06418$	$\sim 1042 m \times 1546 m$	Rural
11	Texas, Plains	$33.18083 : 33.19441$	$- 102.83441 : - 102.81272$	$\sim 1508 m \times 2015 m$	Rural

Table 3. Deployment Parameters Defined by 3GPP and ITU-R Standards.

Parameter	Standard Value/Range	Reference
UAV Altitude (urban)	15–120 m (typical urban), up to 300 m	3GPP TR 36.777
UAV Altitude (general max)	Up to 300 m	3GPP TR 36.777
Ground Node Altitude	1.5 m (outdoor UE height)	3GPP TR 38.901
Deployment Environments	UMi (Urban Micro), UMa (Urban Macro), RMa (Rural Macro)	3GPP TR 38.901
Simulation Area Size	250 m–5 km side length depending on scenario	3GPP TR 38.901
Node Layouts	Hexagonal grid, random drop, 3-sector BSs	3GPP TR 36.777
UAV Density	10 UAVs/200 km² (typical) to 5 UAVs / 400 × 400 m² (dense)	3GPP TR 36.777
LOS/NLOS Path Loss Modeling	Height- and environment-dependent LOS/NLOS models	3GPP TR 38.901
UAV-Specific Propagation Modeling	Line-of-sight, diffraction, and clutter loss over irregular terrain	ITU-R P.1410-5

Table 4. Deployment Configuration for Ground and UAV Nodes.

ID(s)	GN Layout	GN Nodes	GN Alt. (m)	UAV Layout	UAV Nodes	UAV Alt. (m)
1	Grid	625	1.5	Grid	20	{30, 60, 90, 120}
2	Grid	625	1.5	Grid	48	{30, 60, 90, 120}
3	Random	30	[0.5–10]	Random	30	[40–110]
4	Random	40	[0.5–10]	Random	30	[90–110]
5, 6	Random	40	[0.5–10]	Random	30	[40–110]
7, 10, 11	Grid	625	1.5	Grid	80	{30, 60, 90, 120}
8, 9	Grid	625	1.5	Grid	48	{30, 60, 90, 120}

Note: IDs refer to simulation areas listed in Table 2.

Table 5. Material properties of geometries.

Geometry	Material Type	Parameter’s Value
Terrain	Dielectric half-space (Dry earth)	Roughness (m) = 0 Conductivity (S/m) = 0.001 Permittivity = 4
City	One-layer dielectric (Concrete)	Thickness (m) = 0.3 Conductivity (S/m) = 0.015 Permittivity = 7
Foliage (Vegetation, Forest)	One-layer dielectric (Wood)	Thickness (m) = 0.03 Roughness (m) = 0 Conductivity (S/m) = 0 Permittivity = 5
Asphalt	Dielectric half-space	Roughness (m) = 0 Conductivity (S/m) = 0.0005 Permittivity = 5.72

Table 6. Propagation model parameters and environmental conditions.

Parameter	Value
Propagation model	X3D
Ray spacing	0.25 Degree
Number of reflections	6
Number of transmissions	0
Number of diffractions	1
Foliage Model	Weissberger Model
Atmosphere—Temperature (°C)	22
Atmosphere—Pressure (mbar)	1013
Atmosphere—Humidity (%)	50

Table 7. UAV Configuration and Movement Parameters Across Different Locations.

Parameter	Palo Alto	Tennessee	Seoul
Waypoint and Movement Strategy	Constant Speed
Speed (m/s)	20	20	20
Acceleration/ Deceleration	No	No	No
Distance per Increment (m)	20	20	20
Elevation in the Route (AGL) (m)	90	60	100
Total Timesteps	609	1068	301
Ground Node Mobility	Static	Static	Static
Interaction with the Environment	Signal propagation influenced by environmental geometries (e.g., buildings, terrain)
UAV Movement Patterns	Grid-based: the UAV moves in a fixed grid pattern over the area.

Table 8. Encoding of Physical and Logical Events in Interaction Patterns.

Code	Event Type
1	Transmitter
2	Reflection/Ground Bounce
3	Transmission
4	Diffraction
5	Foliage Entry/Exit
6	Diffuse Scattering
9	Receiver

Table 9. Relationship between file formats, represented information, and typical downstream use.

Format/ File Type	Modal	Information Represented	Typical Downstream Use
CSV (node metadata)	Communication Modal	UAV and GN identifiers, ENU coordinates, altitude and node role.	Node feature construction, spatial indexing, geometry-aware learning.
CSV (link-level statistics)	Communication Modal	Link descriptors: received power, path loss, delay metrics, LOS/NLOS/Blocked state.	Path-loss prediction, link-state classification, channel modeling.
CSV (path-level descriptors)	Communication Modal	Per-path delay, power, phase and AoA/AoD angles.	Multipath analysis, delay regression, ML feature extraction.
CSV (interaction points)	Communication Modal	Ray interaction sequence and 3D interaction points.	Propagation mechanism analysis and explainable modeling.
OSM/ GeoJSON	Environment Modal	Map-level geographic data: roads, terrain, building footprints.	Scene context extraction, map-based features, spatial alignment.
DXF	Environment Modal	CAD geometry and structured scene elements.	Geometry parsing and scene reconstruction.
OBJ/STL	Environment Modal	3D mesh representation of buildings and objects.	Mesh analysis, visibility and obstruction modeling.

Table 10. Dataset Sample Statistics by Scenario, Environment, and Frequency.

Environment	Mobility Model	Frequency	Scenario	TX-RX	G2U	U2G	U2U
Rural	Static	Sub-6 GHz	Tennessee	80 + 625	L:50000 P:980400 I:4245995	L:50000 P:907180 I:3733890	L:6400 P:157095 I:625400
		Sub-6 GHz	Texas Plains	80 + 625	L:50000 P:1141564 I:4560975	L:50000 P:1139256 I:4279950	L:6400 P:159193 I:613391
		mmWave	Tennessee	80 + 625	L:50000 P:976436 I:4198453	L:50000 P:906285 I:3706887	L:6400 P:156999 I:620159
	Mobile	Sub-6 GHz	Tennessee	1 + 625	–	L:667500 P:11714080 I:47644849	–
Suburban	Static	Sub-6 GHz	Black Hills (South Dakota)	48 + 625	L:30000 P:717437 I:2923089	L:30000 P:714217 I:2812816	L:2304 P:57578 I:233585
			Lewistown (Montana)	48 + 625	L:30000 P:739235 I:3040440	L:30000 P:738135 I:2939422	L:2304 P:57600 I:217048
			Palo Alto	80 + 625	L:50000 P:1066784 I:4544912	L:50000 P:1039047 I:4328734	L:6400 P:157066 I:632026
		mmWave	Palo Alto	80 + 625	L:50000 P:1066650 I:4530886	L:50000 P:1038980 I:4314615	L:6400 P:157063 I:629725
	Mobile	Sub-6 GHz	Palo Alto	1 + 625	–	L:380625 P:8079314 I:33451151	–
Urban	Static	Sub-6 GHz	Manhattan	20 + 625	L:12500 P:275523 I:1136059	L:12500 P:274071 I:1108362	L:400 P:10000 I:38277
		Sub-6 GHz	Seoul (South Korea)	48 + 625	L:30000 P:614941 I:3072982	L:30000 P:594546 I:2833081	L:2304 P:55307 I:240496
		NB-IoT	Boston	30 + 30	L:900 P:19456 I:88123	L:900 P:19091 I:83073	L:900 P:22500 I:92290
			Pisa East	30 + 40	L:1200 P:29087 I:142053	L:1200 P:27383 I:125173	L:900 P:22480 I:95536
			Pisa West	30 + 40	L:1200 P:26874 I:141049	L:1200 P:21008 I:99918	L:900 P:22307 I:86596
			London	30 + 40	L:1200 P:28788 I:133085	L:1200 P:25823 I:112276	L:900 P:22500 I:87914
		mmWave	Seoul (South Korea)	48 + 625	L:30000 P:614909 I:3073038	L:30000 P:594528 I:2820082	L:2304 P:55305 I:242270
	Mobile	Sub-6 GHz	Seoul (South Korea)	1 + 156 1 + 169 1 + 144 1 + 156	–	BL: L:168125 P:904183 I:4102178; BR: L:167500 P:832557 I:3626844; TL: L:175000 P:931798 I:3690006; TR: L:175000 P:922397 I:4447626	–

Table 11. Cross-source geometric alignment metrics between DXF and GeoJSON buildings.

Metric/Match Type	Value
Strong Matches	520/97%
Shifted Matches	7/1.3%
No Match	9/1.6%
Mean IoU	0.86 (median 0.89)
Mean centroid displacement	0.74 m
Mean footprint area ratio	0.997
