Measurements of LoRaWAN Technology in Urban Scenarios: A Data Descriptor

Abstract: This work is a data descriptor paper for measurements related to various operational aspects of LoRaWAN communication technology collected in Brno, Czech Republic. The paper also provides data characterizing the long-term behavior of the LoRaWAN channel, collected during a two-month measurement campaign that covers two measurement locations, one on the university premises and the second near the city center. The dataset's primary goal is to give researchers who lack LoRaWAN devices an opportunity to compare and analyze the information obtained from 303 different outdoor test locations transmitting to up to 20 gateways operating in the 868 MHz band in a varying metropolitan landscape. To collect the data, we developed a prototype equipped with a Microchip RN2483 Low-Power Wide-Area Network (LPWAN) LoRaWAN transceiver module for the field measurements. As an example of data utilization, we show the Signal-to-Noise Ratio (SNR) and Received Signal Strength Indicator (RSSI) in relation to the distance to the closest gateway.

Dataset: Available on GitHub: https://github.com/BUTResearch/MDPI_Data_Urban_LPWA


Introduction
Today, the definition of a smart machine includes devices that can exchange data with other devices (the Machine-to-Machine (M2M) approach) or with the cloud, execute a number of specific commands, and change their operating logic depending on external conditions in order to handle work that previously required a person, i.e., Human-to-Machine (H2M) interaction [1,2]. High functionality in a small form factor, which satisfies a large number of human needs with a few touches, from communication to health monitoring, makes smart devices a part of the Internet of Things (IoT), which also covers massive Machine-Type Communications (mMTC), one of the fastest-growing technologies of the last decade [3-6]. Experts agree that the smart device market will continue to grow in the near future due to the convenience, broad capabilities, and new "interesting features" of these devices [7]. Moreover, the number of connected smart devices has doubled over the past five years, from 13 to 26 billion devices worldwide, and is estimated to reach the 38.6 billion milestone by 2025 and 50 billion by 2030 [8]. The largest share of this portion of the IoT market belongs to smartphones (approximately be used for different purposes, such as investigating the changes in network parameters in the long term or testing the algorithms to validate their appropriateness for certain applications. A large fingerprinting dataset [32] was collected using postal cars in Antwerp over 3 months with 68 GWs scattered throughout the city. The dataset contains the information reported by the GWs and the real positions from which the uplink packets were sent.
LoRaWAN is an in-demand technology that has carved out its niche among communication solutions for the IoT. It seems to be most applicable when high throughput is not needed but low system cost and high reliability are, especially for resource-constrained devices. The technology is improving, and statistics are required to explore and expand its potential, though there are very few open-source datasets. Therefore, in this paper, we provide a LoRaWAN coverage dataset collected in Brno, Czech Republic. The dataset was prepared to encourage researchers to work with LoRaWAN technology and to identify its application prospects and benefits as a communication solution for smart devices.
The rest of the paper is organized as follows. First, Section 2 provides an overview of LoRaWAN technology. Section 3 covers the data description. Further, Section 4 describes the preparation phase of the measurements and outlines the measurement setup for both the short- and long-term measurement campaigns. The next section visualizes the most important findings from both measurement campaigns. Finally, the last section provides the summary of the data descriptor.

LoRaWAN Technology Overview
The selected LoRaWAN technology is arguably one of the most well-known representatives of LPWAN technologies operating in the unlicensed Industrial, Scientific, and Medical (ISM) frequency band. The LoRaWAN network infrastructure consists of end devices (EDs), GWs, and a network server (NS). Besides these elements, the network may contain a specialized network join (NJ) server to handle roaming between networks. The bare minimum LoRaWAN network consists of a single GW and the NS (which may be integrated into the GW). These two elements are typically connected through an IP-based interface. The functionality of the NS and the user application programming interfaces (APIs) might differ depending on the operator. In most cases, the transmission is initiated by an ED using an Aloha-like channel access mechanism [34]. Therefore, the ED can initiate communication at any time, provided it does not violate the operational restrictions of the selected radio channel. Significantly, LoRaWAN operates in the ISM band (the key parameters can be found in Table 1), which makes its operation license-free in contrast to, for example, cellular-based NB-IoT technology operating in the licensed LTE band. The utilization of unlicensed bands is regulated by the government (in the case of the Czech Republic, the Czech Telecommunication Office is the regulator) and is therefore subject to some operational restrictions.
LoRaWAN provides two options for infrastructure deployment. End users can pay to use the infrastructure provided by communication operators, see Figure 1 with the description of the blocks in Table 2. The alternative is the deployment of a private LoRaWAN infrastructure.
The LoRaWAN network protocol does not strictly specify the modulation technique, so in principle an arbitrary method can be used. However, the most widely used is the Long-Range (LoRa) modulation patented by Semtech [35]. This modulation is based on a spread spectrum technique called Chirp Spread Spectrum (CSS). The spreading of the spectrum is achieved by generating a chirp signal that continuously varies in frequency. The data are then modulated on top of the chirp, spreading a narrowband signal over a wider bandwidth (of at least 125 kHz). The resulting signal has a noise-like spectral signature, which makes it harder to detect or jam [36], and is resilient to narrowband noise.
Furthermore, the frequency characteristics of the chirp waveform, i.e., the slope (which directly affects the symbol duration) of the LoRa modulation, can be modified by the spreading factor (SF) parameter ranging from 7 to 12. With slower frequency variation, each symbol is emitted with higher energy, thus enabling longer communication ranges at the cost of a lower data rate and longer on-air time. In general, the achievable bit rate varies from 250 bps (for SF12/125 kHz) to 21,900 bps (SF7/500 kHz), and the maximum payload size ranges from 11 bytes (SF10, US region) up to 242 bytes (SF7, most regions). Due to the need for power efficiency, the EDs are commonly configured to use the lowest SF that allows reliable communication [37]. The LoRaWAN protocol also defines channel coding in the form of forward error correction. The coding rate parameter indicates the ratio of information bits to the total number of transmitted bits (information plus redundant bits), so the added redundancy introduces further overhead.

The LoRaWAN ED may start its uplink transmission at any time, see Figure 2. An ED randomly selects one of up to 16 available frequency channels for its transmission (valid for the EU region). The SF and the transmit power used by the ED are either predefined or, which is strongly recommended, set (either by the ED itself or by the NS) based on the radio channel conditions between the ED and the nearest GW. The maximum payload for LoRaWAN uplink packets depends on the SF used and is at least 51 bytes (for the EU region). The number of messages per day is not limited either, but the ED is required to obey local restrictions concerning the maximum duty cycle. Packet retransmission in LoRaWAN is optional and rarely used.

Table 2. Description of the blocks in Figure 1.

TR-069: Technical specification that defines an application protocol for remote management of network devices.
HTTPS: Hypertext Transfer Protocol (HTTP) with the TLS (SSL) encryption protocol, mostly used for web browsing.
Routing Services
    Network server: Implements the LoRaWAN protocol, validates the authenticity and integrity of devices, performs packet deduplication, organizes downlink transmission, and many others.
    Routing server: Routes traffic based on the customer's account and the selected server.
Database: Data storage for billing server.
Server for A, B, and C: Customer server that handles end-device join procedures, key storage, and management.
Positioning server: Provides position calculation (estimation) of end devices in the network.
Billing server: Handles the tracking of billable products and services.
GNMS (Global network management system): Provides remote monitor and control functionality.
Oauth2: Industry-standard protocol for authorization that provides API clients limited access to user data on a web server.
RESTful: Architectural style for API (application program interface) that uses HTTPS requests to access and use data.
Customer-Specific API: Proprietary protocol implementation.

The downlink functionality of the LoRaWAN ED depends on its class. Each LoRaWAN ED operating in class A opens two Receive Windows (RWs) following its uplink transmission. The first RW (RW1) opens after the preagreed waiting time (typically 1 s) in the same frequency channel that the ED used for its uplink transmission. The second RW (RW2) opens 1 s after the first window in the preagreed frequency channel (in the EU, typically the 869.5 MHz channel). The SF used in RW1 depends on the SF of the uplink, while RW2 is always opened with a predefined SF. An ED having only these RWs is classified as a class A ED. Class B devices are required to synchronize with the network and open periodic RWs at prespecified time slots. Finally, a class C ED listens continuously whenever it is not transmitting or occupied by RW1.
In addition to the duty cycle restrictions, there is also a limit on transmission power, especially for resource-constrained devices. The power values are defined here as Effective Radiated Power (ERP) values, i.e., the power that must be supplied to a reference half-wave dipole antenna to obtain the same electric field strength as the actual device at the same distance in the direction of maximum antenna gain. Another often-used definition is Effective Isotropic Radiated Power (EIRP), i.e., the power that must be supplied to a reference isotropic antenna to obtain the same electric field strength as the actual device at the same distance. The EIRP and ERP can be converted into each other using P_EIRP = P_ERP + 2.15 dB if the powers are expressed in dBm [38]. Maximum transmission power limits are given by the LoRaWAN regional specification and by each country's local regulations, which must be verified in advance. According to the LoRaWAN specification, EIRP limits generally range from 12.15 to 30 dBm [39].
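The conversion between ERP and EIRP stated above can be illustrated with a short sketch. The function names are hypothetical and chosen only for this example; the 2.15 dB offset is the one given in the text, and the dBm-to-milliwatt conversion is the standard logarithmic relation.

```python
import math

# Offset between ERP and EIRP as stated in the text (half-wave dipole vs. isotropic).
DIPOLE_OFFSET_DB = 2.15


def erp_to_eirp_dbm(p_erp_dbm: float) -> float:
    """Convert an ERP value in dBm to the equivalent EIRP in dBm."""
    return p_erp_dbm + DIPOLE_OFFSET_DB


def eirp_to_erp_dbm(p_eirp_dbm: float) -> float:
    """Convert an EIRP value in dBm to the equivalent ERP in dBm."""
    return p_eirp_dbm - DIPOLE_OFFSET_DB


def dbm_to_mw(p_dbm: float) -> float:
    """Convert power from dBm to milliwatts."""
    return 10 ** (p_dbm / 10)


def mw_to_dbm(p_mw: float) -> float:
    """Convert power from milliwatts to dBm."""
    return 10 * math.log10(p_mw)


if __name__ == "__main__":
    erp = 14.0  # dBm, the transmit power used in the measurement campaigns
    print(f"{erp} dBm ERP = {erp_to_eirp_dbm(erp):.2f} dBm EIRP")
    print(f"{erp} dBm = {dbm_to_mw(erp):.1f} mW")
```

Under these relations, the 14 dBm limit corresponds to roughly 25 mW, and the lower EIRP bound of 12.15 dBm corresponds to 10 dBm ERP.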

LoRaWAN Regional Parameters for EU
There are two frequency plans available for the European continent: EU863-870 and EU433. Since the Czech CRA agency operates its network in the 868 MHz ISM band, only the first frequency plan is discussed in more detail. Table 3 shows the relationship between the bit rate and the data rate (DR), a numeric value representing the spreading factor, bandwidth, and coding rate settings in the specified region. For EU863-870, a coding rate of 4/5 is used by default. The table shows that the lower the data rate, the lower the bit rate (but the longer the communication range). Six SF values using a bandwidth of 125 kHz (namely, SF7 to SF12) are specified. For these, the on-air data throughput varies from 250 bps up to 5.47 kbps depending on the selected SF. Additionally, there is also one channel with a bandwidth of 250 kHz and SF7 with a throughput of 11 kbps [40,41]. Moreover, channels with higher bandwidth or high-speed Frequency Shift Keying (FSK) modulation may be defined.
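The bit rates quoted above can be approximately reproduced from the SF, bandwidth, and coding rate. The following is a minimal sketch using the commonly cited nominal LoRa relation R_b = SF * (BW / 2^SF) * CR; it is not taken from the dataset or from Table 3. The function name and the ldro flag are assumptions of this example: the flag models the low data rate optimization that is typically enabled for SF11 and SF12 at 125 kHz and is assumed here to carry two fewer bits per symbol.

```python
def lora_bit_rate_bps(sf: int, bw_hz: float, cr_denominator: int = 5,
                      ldro: bool = False) -> float:
    """Nominal LoRa bit rate in bit/s.

    sf             -- spreading factor (7..12)
    bw_hz          -- bandwidth in Hz (e.g., 125_000)
    cr_denominator -- 5 for coding rate 4/5, up to 8 for 4/8
    ldro           -- assumption: low data rate optimization carries SF-2 bits/symbol
    """
    if not 7 <= sf <= 12:
        raise ValueError("SF must be between 7 and 12")
    bits_per_symbol = sf - 2 if ldro else sf
    symbol_rate = bw_hz / (2 ** sf)  # symbols per second
    return bits_per_symbol * symbol_rate * 4 / cr_denominator


if __name__ == "__main__":
    print(lora_bit_rate_bps(7, 125_000))              # about 5.47 kbps
    print(lora_bit_rate_bps(7, 250_000))              # about 11 kbps
    print(lora_bit_rate_bps(12, 125_000, ldro=True))  # about 244 bps
```

The specification tables round these values, which is why SF12 at 125 kHz is usually listed as 250 bps rather than the roughly 244 bps obtained here.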
Each ED selects one of the sixteen available channels in frequency bands from 863 to 870 MHz. However, the condition is that each device (GWs and EDs) must also communicate on the default channels listed in Table 4. The ED must also comply with the duty cycle restriction and the maximum transmission power. The selected ISM frequency band of 868 MHz imposes the limitation of a 1% duty cycle with the maximum transmission power of 14 dBm (25 mW) [42].
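To give a feeling for what the 1% duty cycle means in practice, the sketch below estimates the time on air of a single LoRa frame and the resulting minimum silent period. It uses the time-on-air formula commonly cited from the Semtech SX127x datasheets, not a method described in this paper. The defaults (8 preamble symbols, explicit header, CRC enabled, low data rate optimization when the symbol time exceeds 16 ms) and the 13-byte LoRaWAN framing overhead in the usage example are assumptions of this example.

```python
import math


def lora_time_on_air_s(phy_payload_bytes: int, sf: int, bw_hz: float = 125_000,
                       cr: int = 1, preamble_symbols: int = 8,
                       explicit_header: bool = True, crc: bool = True) -> float:
    """Time on air of one LoRa frame in seconds (cr=1 means coding rate 4/5)."""
    t_sym = (2 ** sf) / bw_hz
    # Assumption: low data rate optimization is on when the symbol lasts over 16 ms.
    de = 1 if t_sym > 0.016 else 0
    h = 0 if explicit_header else 1
    numerator = 8 * phy_payload_bytes - 4 * sf + 28 + (16 if crc else 0) - 20 * h
    n_payload = 8 + max(math.ceil(numerator / (4 * (sf - 2 * de))) * (cr + 4), 0)
    t_preamble = (preamble_symbols + 4.25) * t_sym
    return t_preamble + n_payload * t_sym


def min_off_time_s(time_on_air_s: float, duty_cycle: float = 0.01) -> float:
    """Minimum silent period after a transmission for a given duty cycle."""
    return time_on_air_s * (1 / duty_cycle - 1)


if __name__ == "__main__":
    # 12 B application payload plus an assumed 13 B of LoRaWAN framing overhead.
    toa = lora_time_on_air_s(12 + 13, sf=12)
    print(f"time on air: {toa:.3f} s, minimum off time: {min_off_time_s(toa):.1f} s")
```

Under these assumptions, a 12 B message at SF12 occupies the channel for about 1.5 s and must be followed by roughly 147 s of silence, so a 90 min reporting interval stays far below the duty cycle limit.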

Data Description
The provided open-access dataset consists of JavaScript Object Notation (JSON) records stored in Comma-Separated Values (CSV) files; the data were gathered over multiple hours during two days of measurements. Each JSON file contains parameters as described below. In addition to the payload itself, every record on the server also contains additional metadata. The metadata contains general information about the LoRaWAN message and an array of parameters that provide more detailed message reception information separately for each GW receiving the message. Notably, the parameter names may differ between LoRaWAN service providers. In the case of Ceske Radiokomunikace (CRa), the metadata contains the following parameters [ dr-Data Rate: The string parameter specifying the spreading factor, bandwidth, and coding rate. The spreading factor fundamentally affects the data rate and thus the message time on air. Its value can be selected from the interval 7 to 12. The only bandwidth values are 125, 250, and 500 kHz. The larger the bandwidth, the higher the data rate.
10. ack-Acknowledge: The parameter is of a Boolean type and indicates whether the ED requires confirmation of the sent message. The default is to avoid using acknowledgments to reduce network traffic.
11. gws-Gateways: Contains an array of information objects from individual GWs, especially information about the parameters of the received signal, timestamp, identifier, and location of the GW.
12. bat-Battery status of the ED: 8-bit integer value (0: external power supply; 255: battery status is unknown; 1-254: correspond to battery status 0-100%).
13. data-The field contains HEX data, which is unique for the LoRaWAN device in question. It consists of information related to temperature, position, battery level, etc. In the case of our device, it represents our unique data format, which is specifically designed for the purposes of our measurements.
14. device_Lat-Latitude of the measurement point gathered from the GPS.
15. device_Lon-Longitude of the measurement point gathered from the GPS.
An example of one entry could be found in the following listing: The undeniable advantage of the JSON format is that it is in a human-readable form. Thus, without the need for complex parsing, necessary information can be read immediately.
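The field conventions described above can be turned into a small decoding sketch. Everything in the sample record is hypothetical and serves only as an illustration: the values are made up, the layout of the dr string ("SF12 BW125 4/5"), the rssi and snr keys inside gws, and the linear mapping of bat values 1-254 onto 0-100% are assumptions of this example and should be checked against the actual files in the repository.

```python
import json
import re


def decode_battery(bat: int):
    """Map the 8-bit 'bat' field to a (source, percent) pair."""
    if not 0 <= bat <= 255:
        raise ValueError("bat must be an 8-bit value")
    if bat == 0:
        return ("external", None)
    if bat == 255:
        return ("unknown", None)
    # Assumption: values 1..254 map linearly onto 0..100 %.
    return ("battery", round((bat - 1) / 253 * 100, 1))


def parse_dr(dr: str):
    """Split a data-rate string into (SF, bandwidth in kHz, coding rate)."""
    # Assumption: the string looks like "SF12 BW125 4/5".
    match = re.fullmatch(r"SF(\d+)\s+BW(\d+)\s+(\d/\d)", dr.strip())
    if match is None:
        raise ValueError(f"unexpected dr format: {dr!r}")
    return int(match.group(1)), int(match.group(2)), match.group(3)


def summarize(record: dict) -> dict:
    """Collect a few derived values from one metadata record."""
    best = max(record["gws"], key=lambda gw: gw["rssi"])  # strongest reception
    return {
        "dr": parse_dr(record["dr"]),
        "battery": decode_battery(record["bat"]),
        "gateways": len(record["gws"]),
        "best_rssi": best["rssi"],
    }


# Hypothetical record; keys inside "gws" and all values are illustrative only.
SAMPLE = """{"dr": "SF12 BW125 4/5", "ack": false, "bat": 254,
  "gws": [{"rssi": -113, "snr": -7.5}, {"rssi": -101, "snr": 2.0}],
  "device_Lat": 49.0, "device_Lon": 16.0}"""

if __name__ == "__main__":
    print(summarize(json.loads(SAMPLE)))
```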

Measurement Details
To perform the measurement campaign, we designed and constructed an LPWA device prototype capable of transmitting data using diverse LPWA communication technologies. In this paper, we consider the LoRaWAN technology, i.e., data transmissions utilizing the license-exempt frequency band. This section describes the construction of the device and discusses the executed measurement campaigns.

Evaluation Preparation
For the measurement campaign, a measurement prototype was developed and subsequently used for in-depth evaluation of the energy consumption and communication capabilities of LoRaWAN. The prototype is equipped with a Microchip RN2483 Low-Power Wide-Area LoRaWAN transceiver module; its main technical details are given in Table 5, while the actual module is depicted in Figure 3.

Before the measurements, the constructed prototype was tested in the Vötsch VC3 7018 temperature chamber as well as in the Electromagnetic Compatibility (EMC) anechoic chamber, both of which are part of the testing laboratories at Brno University of Technology, see Figure 5. Only the devices that passed these tests successfully, e.g., operating in the temperature range from −18 °C to +85 °C, were included in testing.

In this work, we used the Agilent N6705A DC power analyzer [45] for current measurements; the power consumption was measured through dedicated test pins on a custom-designed board. While conducting the measurements, a significant growth of power consumption with increasing temperature was observed for the RN2483 module. This growth in current consumption is caused by the natural behavior of the semiconductor junction. According to the performed measurements, at +85 °C the current consumption in power-saving mode rose to 20 µA, which is more than 6 times higher than at +25 °C.
The power consumption characteristics of the LoRaWAN module depicted in Figure 6 clearly indicate that the device operates in Class A mode. The short spike around 40 mA represents the transmission of a 50 B message using SF7. After the predefined 1 s delay, the first receive window (RX1) is opened. Notably, when a confirmed message is sent, the duration of RX1 is longer than in the case of an unconfirmed transmission. This extension is caused by the fact that actual data are transmitted during RX1 when an acknowledgment is sent. However, when the data are successfully received, no additional reception window (RX2) is opened. On the other hand, the unconfirmed mode still requires the opening of RX2 after the predefined delay of 2 s after the data transmission. As a result, using the confirmed mode may even lead to decreased power consumption. The actual measurements, depicted in Figures 7 and 8, verify the theory mentioned above, as the confirmed transmission indicates slightly lower power consumption for all combinations of message sizes and spreading factors. From the message size of 100 B upward, the higher spreading factors are limited by the DC restriction; therefore, they are omitted from the results. The influence of the abovementioned effect is especially visible for small message sizes and low SF values. Logically, RX2 represents a more significant part of the current consumption when the transmission time is shorter. A new important finding is connected with the increasing SF value. Up to SF10, the power consumption rises nearly linearly. However, from SF11, the increase is more pronounced even though the transmission time doubles with each increase in SF. It is logical to assume that up to SF10, the reception windows represent a significant part of the consumed energy.

City-Scale Measurement Campaign
The measurement campaign was executed in the urban area of Brno, Czech Republic. The test area covered a radius of around 35 km with 231 test locations, mainly located next to public bus stops, see Figure 4 (the map of measurement points with the location data is accessible online via Google Maps). This area is covered by at least 20 GWs of the CRA LoRaWAN network, and the measurements were taken on all three default channels of the 868 MHz band, with the default coding rate of 4/5, the maximum SF of 12, and the maximum transmit power of 14 dBm to achieve maximum communication coverage.
The class A measurement devices were equipped with vertically aligned dipole antennas and installed one meter above ground level (the board was tilted by 90° to achieve the horizontal position), set up in the test locations, and requested to transmit a set of five packets. All duplicates were kept during the transmissions (allowing us to analyze the impact of spatial diversity on an actual multi-GW LoRaWAN system). Next, the data were repeatedly collected and automatically uploaded to centralized storage after the measurements.

Long-Term Measurement Campaign
With the ultimate goal of exploring long-term channel characteristics, we conducted a measurement campaign of LoRaWAN technology spanning over two months. One communication unit was placed on the rooftop of the BUT building approximately 25 m above ground-level for these measurements. This configuration represents a suburban scenario, as the university building is located in the Brno suburb. The second device was deployed on the windowsill of the authors' flat situated near the city center. It is expected that the unit located in the city center will experience higher signal fluctuations due to the increased number of interference sources (surrounding buildings, passing cars, trams, etc.).
Both devices were set to transmit 12 B messages in unconfirmed mode every 90 min. The transmit power was set to 14 dBm with SF12 and a coding rate of 4/5 to provide the maximum communication range. SF12 was selected because the goal of the measurement campaign was initially related to a scenario in which the success rate was the key performance indicator. During the two months, we gathered more than 1000 messages from each device.

Measurements Results
This section provides the most important results obtained after postprocessing of the collected data.

City-Scale Measurements Results
The presented data include the outage probability of the measurement node with respect to its distance from the closest LoRaWAN GW, see Figure 9. These values are derived from the Euclidean distance between each ED and GW. For better visualization, the distances are divided into 23 bins with 250-m steps. The results show that in the case of the LoRaWAN network's real deployment in Brno, the maximum distance to the closest GW does not exceed 8 km. Notably, half of the nodes were within a 1 km radius of the GW and 80% within 2 km. Significantly, the threshold for successful packet delivery was also in the range of 1-2 km, since 50% of the nodes at this distance experienced at least one packet loss.

Figure 9. Effect of the distance to the nearest GW on the outage probability (legend: test points served, test points in outage).
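The distance-based outage analysis described above can be approximated from the dataset coordinates along the following lines. This is a minimal sketch rather than the authors' processing script: it uses the great-circle (haversine) distance as a stand-in for the Euclidean distance mentioned in the text, which differs only marginally at city scale, and the function names, the input layout of (distance, served) pairs, and the Earth radius constant are assumptions of this example.

```python
import math
from collections import defaultdict

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, an assumption of this sketch


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def distance_bin(distance_m: float, step_m: float = 250.0) -> int:
    """Index of the distance bin (bin 0 covers 0 m up to, but excluding, step_m)."""
    return int(distance_m // step_m)


def outage_by_bin(points, step_m: float = 250.0) -> dict:
    """Outage probability per bin.

    points -- iterable of (distance_to_closest_gw_m, served) pairs, where
              served is True if at least one packet from that location arrived.
    """
    total = defaultdict(int)
    outage = defaultdict(int)
    for distance_m, served in points:
        b = distance_bin(distance_m, step_m)
        total[b] += 1
        if not served:
            outage[b] += 1
    return {b: outage[b] / total[b] for b in sorted(total)}


if __name__ == "__main__":
    demo = [(100.0, True), (200.0, False), (300.0, True)]  # made-up points
    print(outage_by_bin(demo))
```

The distance to the closest GW for a test location would then be the minimum of haversine_m over all GW coordinates reported for the network.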
The second part of the measurement results corresponds to the actual wireless channel characteristics, e.g., the distributions of RSSI and SNR for successfully delivered packets. We recall that the map of measurement points with the location data is accessible online via Google Maps. Figures 10 and 11 indicate the impact of the communication range between the EDs and GWs on the key radio parameters, i.e., SNR and RSSI. Interestingly, Figures 10 and 11 show that some packets were also delivered from distances over 50 km. Notably, the observed RSSI values range from −64 to −125 dBm (dynamic range of approximately 34 dB; the SNR is in the range from −20 to 14 dB), though SF12 is designed to typically work down to −137 dBm. Nevertheless, during our campaign, we did not capture any data transmissions below −127 dBm. Additionally, the SNR levels captured during the measurements align with the theoretical assumptions, as they do not cross the threshold of −20 dB.

The interrelation between the discussed parameters is given in Figure 12. Here, we can observe a significant deviation of both parameters. Figure 12 illustrates that the dependency between the SNR and RSSI is linear and that the SNR is above 0 dB only in ideal radio conditions. For the rest (levels below −100 dBm), the SNR values are widely scattered. Therefore, the results suggest that the SNR represents a more limiting factor than the RSSI, as even a sample with low RSSI may have a relatively good SNR.
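The −137 dBm sensitivity and the −20 dB SNR threshold mentioned above are linked by the standard receiver sensitivity relation S = −174 + 10 log10(BW) + NF + SNR_limit. The sketch below evaluates it for SF12. The 6 dB noise figure is an assumption of this example (a value often quoted for LoRa transceivers), not a number taken from the measurements.

```python
import math

THERMAL_NOISE_DBM_PER_HZ = -174.0  # thermal noise density at room temperature


def lora_sensitivity_dbm(bw_hz: float, snr_limit_db: float,
                         noise_figure_db: float = 6.0) -> float:
    """Receiver sensitivity in dBm.

    bw_hz           -- receiver bandwidth in Hz
    snr_limit_db    -- minimum demodulation SNR (−20 dB for SF12, per the text)
    noise_figure_db -- receiver noise figure; 6 dB is an assumed typical value
    """
    return (THERMAL_NOISE_DBM_PER_HZ + 10 * math.log10(bw_hz)
            + noise_figure_db + snr_limit_db)


if __name__ == "__main__":
    print(f"SF12 / 125 kHz: {lora_sensitivity_dbm(125_000, -20.0):.1f} dBm")
```

With a 125 kHz bandwidth, the noise floor including the assumed noise figure is about −117 dBm, so a −20 dB demodulation limit leads to a sensitivity of about −137 dBm.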

Long-Term Measurement Results
The long-term RSSI measurement results depicted in Figures 13 and 14 verify that our basic premise of the BUT sensor experiencing smaller fluctuations and having an overall higher signal level was only partially correct. Indeed, the BUT sensor's signal levels are approximately 30 dB higher than those of the city center module. However, the RSSI fluctuation is less pronounced for the city center module. Most of the samples for both sensors fit in a 20-dB range around the average signal level; however, for the city center sensor, most of the values lie within a 10-dB range, whereas for the BUT sensor, this range is almost 20 dB.

In the case of SNR, depicted in Figures 15 and 16, the situation is slightly different. As expected, the BUT sensor indicates much better SNR values. Notably, almost 600 samples fit between 5 and 10 dB for the BUT sensor. Further, the SNR values of most of the samples are over 0 dB. Conversely, for the city center module, not a single sample has an SNR better than 0 dB. On top of that, the SNR fluctuation of the city center module is significantly higher. The SNR values are more evenly distributed over the whole range of SNR values (around 10 dB). It is an interesting finding, as even in the case of the city center sensor, there are RSSI samples comparable to those of the BUT module. However, the SNR on these occasions is still significantly worse. Hence, it appears that the SNR is more dependent than the RSSI on the radio conditions and the overall separation distance between the ED and GW.

Data Descriptor Summary
This data descriptor paper provides the dataset descriptor of LoRaWAN technology measurements in the midsize city scenario of Brno, Czech Republic. The dataset covers 311 outdoor test locations in a varying landscape processed on 39 static gateway nodes. The dataset provides various measurement data ranging from communication to location-related information, providing researchers with broad opportunities to analyze the deployment and improvement of the LoRaWAN operation for numerous IoT use cases. In addition, we explore the long-term characteristics of the real-world LoRaWAN network based on a measurement campaign spanning over two months. Finally, we report on power consumption measurements conducted on an off-the-shelf LoRaWAN communication module using different SF settings and various message sizes. All mentioned datasets are publicly available on GitHub and will be extended after the collection of new, real-world measurement data.
Based on the conducted city-scale measurement campaign, it can be stated that LoRaWAN technology represents a solid communication technology for delay- and loss-tolerant applications, as it provides a cumulative packet delivery ratio of 83% during the whole experiment. Impressively, only 16 locations of 311 were not served, giving an overall outage probability of only 5.15%. Notably, the results suggest that the geographically closest GW does not always provide the best RSSI or SNR. In reality, 40% of GWs with the best RSSI and 34% with the best SNR are not the closest ones.
The long-term measurements show interesting results. During the two months, the RSSI fluctuated significantly, within a range of almost 50 dB. In terms of SNR, the fluctuation ranged around 25 dB. These results suggest that the use of conventional empirical propagation models may lead to significant inaccuracies in the predicted path loss values. Hence, a more precise dynamic channel model will be needed. The provided dataset may serve as a good starting point for the development of such a model.

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the study's design; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.