Modeling and Analyzing Urban Sensor Network Connectivity Based on Open Data

Musznicki, Bartosz; Piechowiak, Maciej; Zwierzykowski, Piotr

doi:10.3390/s23239559


by Bartosz Musznicki ¹, Maciej Piechowiak ²,* and Piotr Zwierzykowski ¹

¹ Institute of Computer and Communication Networks, Faculty of Computing and Telecommunications, Poznań University of Technology, 60-965 Poznań, Poland

² Department of Computer Science, Kazimierz Wielki University, 85-064 Bydgoszcz, Poland

* Author to whom correspondence should be addressed.

Sensors 2023, 23(23), 9559; https://doi.org/10.3390/s23239559

Submission received: 3 November 2023 / Revised: 28 November 2023 / Accepted: 29 November 2023 / Published: 1 December 2023

(This article belongs to the Special Issue Internet of Things (IoT) in Smart Cities and Urban Planning)


Abstract

The optimization of network topology is crucial to achieve efficient data transmission in wireless sensor networks. Recently it has been proven that emerging open data sources can be used for modeling the structures of heterogeneous urban sensor networks. With this, leveraging real location data of various networked and sensing devices became feasible and essential. This approach enables the construction and analysis of more accurate representations based on frequently updated actual network infrastructure topology data, as opposed to using synthetic models or test environments. The presented modeling method serves as the basis for the designed architecture and implemented research environment. This paper introduces a set of algorithms which transform devices’ location data into graph-based wireless network connectivity models. Each algorithm is thoroughly discussed and evaluated. Moreover, static (momentary) and dynamic (time-spanning) network topologies are constructed in four large Polish cities based on publicly available data. Multidimensional simulation-based analysis is conducted to investigate the characteristics of the modeled structures. Directions for further research are suggested as well.

Keywords: urban sensor networks; open data; graph modeling; connectivity analyzing

1. Introduction

Wireless sensor networks (WSNs) are usually imagined and designed as homogeneous structures, studied using synthetic models and computer simulations [1] or in experimental testbeds [2]. The research on some vehicular ad hoc networks (VANETs) and delay-tolerant networks (DTNs) was based on historical data obtained from transportation operators [3,4]. Another investigation used the readings gathered by a telecommunications operator in a proprietary urban mobile relay network [5]. In a world which is becoming increasingly networked, various new kinds of devices are connected in urban spaces, e.g., electricity meters, home automation and entertainment devices, trash bins, parking meters, etc. Some are designed specifically for sensing purposes while others are capable of performing different types of measurements in addition to their main functions. The boundaries between different types of hitherto studied networks are becoming blurred and their structures are becoming more heterogeneous.

Currently, new diverse online data sources are emerging. They include both ones that provide infrequently changing sets of data and ones that serve real-time data related to public transport vehicles and elements of urban infrastructure. More and more of these sources are available online and enable access to data related to, e.g., buses and trams, as well as public transport stops and ticket machines. Not only the geographic location of each element is available, but quite often also additional information, e.g., the type of the device, current running parameters, and recent values of the measurements. Due to the ongoing development of data storage, processing, and distribution technologies, these data can be made publicly available [6,7]. Therefore, further open data sources are expected to become available within the coming years. This opens up a range of unexploited research and development possibilities related to deterministically and randomly deployed nodes of sensing capabilities [8]. See the examples of such connected devices in cities in Poland in Figure 1.

This article builds on the original idea presented by the same team in [9] and extends it with a detailed study of the effectiveness of the proposed algorithms. It has been proven that open data can be used for modeling heterogeneous urban sensor networks. The actual types and features of these networks are reviewed and key routing research problems are defined. The characteristics of data sources are presented and different exemplary graphs are modeled to show the feasibility of the method and to indicate potential applications. Moreover, a practical network modeling architecture is introduced.

The next sections concretize and investigate the concept further. First, in Section 2, new urban sensor network connectivity modeling algorithms are presented and discussed. They include both static (momentary) and dynamic (time-spanning) graph-based network modeling methods. Then, Section 3 introduces the multidimensional simulation study’s methodology and architecture. Open data related to four Polish cities are used. Diverse geographic areas are defined and example modeled networks are presented. The results are thoroughly discussed in Section 4 and Section 5 to investigate the properties of each algorithm and modeled structure. Section 6 presents a summary of our findings.

2. Network Modeling Algorithms

The complete modeling flow, composed of algorithms introduced in the next subsections, is shown in Figure 2. Both static (space) and dynamic (space–time) realistic graphs can be generated to enable graph-based analysis of network topologies and routing algorithms of interest.

Time-changing graph representations and nomenclature were reviewed in [9] and can be referenced when needed. Based on the presented naming evolution, the terms slot, space graph, and space–time graph, as well as, e.g., space edge and time edge, will be used as the basis for naming modeled networks and their elements. Some graphs will be additionally termed time-expanded or time-aggregated to indicate if their form is layered or compacted.

The discussion of each algorithm closes with a statement of its time complexity. Data structures implemented with hash tables are assumed and, therefore, average-case complexity $O(1)$ applies to all basic data insertion, search, update, and deletion operations. These include, e.g., obtaining an element of a simple set or accessing an element of a more advanced dictionary-like keyed structure. For this reason, the cost of such operations is not taken into account, and the stated time complexities reflect only the core iterative structure of each algorithm.

2.1. Network Device Data to Slots of Space Nodes

The first stage of modeling, presented in Algorithm 1, network device data to slots of space nodes (NDD-SSN), is aimed at data quantization, i.e., the construction of a list of subsequent time $slots$. The term comes from the research of Huang et al., where it was used to denote the space between consecutive layers of a space–time graph [10]. In the presented novel network modeling approach, these $slots$ are network topology snapshots that capture the deployment of modeled physical wireless network $devices$ in consecutive intervals of $slotLength$. Each time $slot$ groups the $instances$ (occurrences) of all $devices$ in a network $area$ of interest considered to belong to a given $timeFrame$. Such a timeframe is defined for every network device $class$. The time distribution of data related to the devices is discrete. The two-dimensional $area$ is defined by two space-related closed intervals. As a result, each actual $device$ is represented as a $node$ in the $slots$, and the $device$ is considered to have occurred. The $slots$ are sets of nodes only, i.e., the nodes are not connected with any edges yet, as presented in Figure 3. Stationary simple nodes are depicted with green circles, stationary advanced nodes are depicted with blue triangles, and mobile advanced nodes are blue triangles additionally marked with a black border.

Algorithm 1: Network device data to slots of space nodes.
Input:
$area \leftarrow ([X_{min}, X_{max}], [Y_{min}, Y_{max}])$	`//` `network area of interest`
$classes \leftarrow (class_{i})_{i \leftarrow 1}^{j}$:	`//` `list of j distinct device classes`
$class_{i} \leftarrow \{device_{k}\}_{k \leftarrow 1}^{l}$:	`//` `set of l devices of class i`
$device_{k} \leftarrow ($	`//` `device k`
$instances_{k} \leftarrow (instance_{m})_{m \leftarrow 1}^{n}$:	`//` `list of n time instances of device k`
$instance_{m} \leftarrow ($	`//` `instance m`
$time_{m},$	`//` `instance occurrence time`
$location_{m}$	`//` `instance location`
$),$
$id_{k},$	`//` `device id`
$role_{k}$	`//` `device role`
)
$slotLength \in \mathbb{R}_{> 0}$	`//` `length of time slots`
$topologyLength \in \mathbb{N}^{+}$	`//` `number of subsequent time slots of the time topology`
$topologyStart$	`//` `start time of the topology`
$windows \leftarrow (window_{i})_{i \leftarrow 1}^{j}$	`//` `list of j time window lengths corresponding to the respective j device classes`
Output:
$slots \leftarrow (slot_{p})_{p \leftarrow 1}^{topologyLength}$:	`//` `list of subsequent time slots`
$slot_{p} \leftarrow \{node_{r}\}_{r \leftarrow 1}^{s}$:	`//` `set of all s nodes of slot p`
$node_{r} \leftarrow ($	`//` `node r of slot p`
$class_{r},$	`//` `node class`
$id_{r},$	`//` `node id`
$location_{r},$	`//` `node geographic location`
$role_{r},$	`//` `node role`
$slotNumber_{r}$	`//` `node slot number`
)
1 $slots \leftarrow assignDevicesInstancesToSlots()$	`//` `assign devices instances to slots of nodes`
2 output slots

The time topology begins at a given $topologyStart$ time and its span, i.e., $topologyLength$, is defined by the number of subsequent time $slots$ of equal nonzero $slotLength$ within the network modeling period. Each $device$ in the $area$ of interest is identified by a unique $id$, performs the desired network $role$, and belongs to one of the distinct $classes$. A data-lookup $window$ of a given length, i.e., duration, is defined for each such $class$. Each distinguished occurrence of a $device$ is considered a time $instance$ of this $device$. Every $instance$ is marked with the occurrence $time$ and $location$.

The Procedure assignDevicesInstancesToSlots starts with initializing the list of empty time $slots$, setting $topologyStart$ as the initial $slotEnd$ time and obtaining the number of device classes j. Then, it iteratively defines the sets of nodes that belong to each $slot$. Every iteration begins with moving the current value of $slotEnd$ time by $slotLength$. Then, for each $class$, it is checked which time $instance$ of every $device$ in the network $area$ is the newest time occurrence of that $device$ in the $timeFrame$ of interest. This $timeFrame$ is defined as a half-open interval preceding $slotEnd$ by the time length of the $window$ of the $class$ of the $device$. If $newestInstance$ is selected, i.e., an $instance$ which satisfies the time conditions related to the current $slot$ and $class$, a new $node$ is added to the current $slot$. That $node$ is marked with the $class$, $id$, $location$, $slotNumber$, and $role$ of the $instance$ of the given $device$. As a result, the list of $slots$ of the nodes is obtained. It has to be pointed out that a list is an ordered sequence of elements while a set is an unordered collection.

Due to the nested iterative nature of the algorithm, the upper bound of its time complexity is related to the number of $slots$ (topology length), the number of device $classes$, the number of devices in the largest $class$, as well as the maximum number of device $instances$. Therefore, it can be defined as $O(|slots| \cdot |classes| \cdot |class|_{MAX} \cdot |instances|_{MAX})$.

Procedure assignDevicesInstancesToSlots
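
The steps described for this procedure can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: devices are plain dictionaries, times are plain numbers, each slot is a dictionary keyed by device id rather than a set, and the half-open time frame is taken to be (slotEnd − window, slotEnd], since the text does not say which end of the interval is open. The role names and sample data are hypothetical.

```python
def assign_device_instances_to_slots(area, classes, slot_length,
                                     topology_length, topology_start, windows):
    """Quantize device instances into a list of slots (collections of nodes)."""
    (x_min, x_max), (y_min, y_max) = area
    slots = []
    slot_end = topology_start
    for slot_number in range(1, topology_length + 1):
        slot_end += slot_length
        slot = {}  # assumption: keyed by device id, one node per device
        for class_index, devices in enumerate(classes):
            frame_start = slot_end - windows[class_index]
            for device in devices:
                newest = None
                for time, (x, y) in device["instances"]:
                    # assumption: the time frame is (frame_start, slot_end]
                    in_frame = frame_start < time <= slot_end
                    in_area = x_min <= x <= x_max and y_min <= y <= y_max
                    if in_frame and in_area and (newest is None or time > newest[0]):
                        newest = (time, (x, y))
                if newest is not None:
                    slot[device["id"]] = {
                        "class": class_index,
                        "id": device["id"],
                        "location": newest[1],
                        "role": device["role"],
                        "slotNumber": slot_number,
                    }
        slots.append(slot)
    return slots


# Hypothetical data: one mobile device with a short window, one stationary
# device with a long window.
bus = {"id": "bus-7", "role": "relay",
       "instances": [(5, (1.0, 1.0)), (12, (2.0, 2.0)), (31, (9.0, 9.0))]}
stop = {"id": "stop-1", "role": "sensor", "instances": [(0, (0.5, 0.5))]}
slots = assign_device_instances_to_slots(
    area=((0.0, 5.0), (0.0, 5.0)), classes=[[bus], [stop]],
    slot_length=10, topology_length=3, topology_start=0, windows=[10, 1000])
```

The four nested loops mirror the four factors of the stated upper bound. A long window for a stationary class lets a single old reading keep the device present in every slot, while a short window for a mobile class makes the device disappear as soon as its readings stop.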

2.2. Slots of Space Nodes to Space Connectivity List

When the data related to the network devices have been turned into subsequent time slots of nodes, the next step of modeling the network topology can take place. Algorithm 2, slots of space nodes to space connectivity list (SSN-SCL), constructs the space connectivity list (SCL), i.e., the list of subsequent time-ordered directed space connectivity graphs (SCGs), based on the list of $slots$ and the assumed $radioCoverage$ of the devices. Such an $SCL$ was called an evolving graph by Ferreira [11]. The graphs are considered to reflect possible temporary connectivity of the network in the related $slot$, as shown in Figure 3. Dashed links are the ones originating in stationary nodes, while solid links are those starting in mobile nodes. Here, $radioCoverage$ can be a simple nonzero omnidirectional constant-range function, as well as an advanced model-based function which depends on, e.g., device class, state, radio transmission and reception capabilities, as well as propagation conditions. An example SCL with six $slots$ and nine $nodes$ is presented in Figure 4. It is assumed that nodes of a simple type can only be the ends of the edges (receivers), while advanced nodes can also be the start (transmitters). The number of nodes varies among the graphs and reflects the changes in the number of space nodes in the area of interest over time.

Algorithm 2: Slots of space nodes to space connectivity list

The $boundingRegion$ is used to reduce the number of inter-nodal distance computations. This is based on the observation that for complex $radioCoverage$ functions it is beneficial to perform the first rough neighbor $candidate$ filtering in a less computationally intensive way. Afterwards, further precise calculations are performed only for the pairs of nodes that are close enough, and therefore likely to be able to establish a connection, depending on the shape, size and center (location) of the $radioCoverage$ of those nodes. The simplest approach to determine this, when working with a spheroid-based coordinate system, is to find a quasi-rectangular projected circumscribed area of the $radioCoverage$ of the $node$. In the most simplistic case of a flat network area and uniform omnidirectional $radioCoverage$, the $boundingRegion$ would be a circumscribed rectangle of a circle centered at the location of the node, with the radio range being the radius of this circle.
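
For the most simplistic case named above (flat area, uniform omnidirectional coverage), the two-stage check might look like the following Python sketch. The function names and the planar coordinates are illustrative assumptions and are not taken from the paper.

```python
import math


def bounding_region(location, radio_range):
    """Circumscribed axis-aligned square of a circular radio coverage."""
    x, y = location
    return (x - radio_range, x + radio_range, y - radio_range, y + radio_range)


def in_bounding_region(region, location):
    """Cheap rough filter: four comparisons, no square root."""
    x_min, x_max, y_min, y_max = region
    x, y = location
    return x_min <= x <= x_max and y_min <= y <= y_max


def in_radio_coverage(center, radio_range, location):
    """Precise check: Euclidean distance against the radio range."""
    return math.dist(center, location) <= radio_range


region = bounding_region((0.0, 0.0), 10.0)
far = in_bounding_region(region, (20.0, 0.0))           # rejected by the filter
corner = in_bounding_region(region, (8.0, 8.0))         # passes the filter ...
corner_covered = in_radio_coverage((0.0, 0.0), 10.0, (8.0, 8.0))  # ... but not covered
near_covered = in_radio_coverage((0.0, 0.0), 10.0, (6.0, 6.0))
```

The rough filter never rejects a point that is actually covered. It only lets through some extra candidates near the corners of the square, which the precise check then discards.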

The algorithm begins by establishing the number j of time slots of the nodes, i.e., obtaining the cardinality of the list of $slots$, and initializes the list of j empty SCG graphs. Then, iteratively, each $SCG_{i}$ graph is filled with edges $SCE_{i}$ and vertices $SCV_{i}$. This list of l vertices is simply the list of all nodes of $slot_{i}$. The edges $SCE_{i}$ are determined in a more complex, yet computationally optimized way.

For each $SCG_{i}$, an empty set of edges $SCE_{i}$ is initialized. Then, the set of vertices (nodes) $SCV_{i}$ is traversed. In every iteration, each not yet traversed vertex of $SCV_{i}$ is iteratively considered a neighbor $candidate$. By this means, the number of computations can be limited. In other words, the candidates are the $l - k$ nodes which succeed the current $node$ (the $k^{th}$ one) in the list of l nodes of $SCV_{i}$. This optimization can be applied because the process is symmetric: each pair needs to be examined only once, and the operations aimed at both the $node$ and the $candidate$ can follow from that single examination.

If the $candidate$ is located within the $boundingRegion$ of the $node$, then it is checked whether the $node$ is placed within the $radioCoverage$ of the $candidate$ and whether the $candidate$ is within the $radioCoverage$ of the $node$. When both conditions are met, the $distance$ between the $node$ and the $candidate$ is calculated. The method $edgeDistance$ can be a simple computation of geographical distance between two nodes by means of determining the great-circle distance on a sphere [12]. It can also be a more complex metric function, e.g., related to the minimum power needed to complete a single transmission to the neighboring node. Next, if the $node$ is a relay, then $SCE_{i}$ is extended with an edge from the $node$ to the neighbor $candidate$. Similarly, an edge from the neighbor $candidate$ to the $node$ is added to $SCE_{i}$ if the $candidate$ is a relay. When all iterations are completed, the $SCE_{i}$ and $SCV_{i}$ of $SCG_{i}$ are stored in the $SCL$.
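
The construction of a single space connectivity graph can be sketched in Python as below. The sketch assumes a constant omnidirectional radio range shared by all nodes (so the two coverage checks collapse into one distance comparison), uses the haversine formula on a spherical Earth as the edge distance, and omits the bounding-region prefilter for brevity. The node dictionaries, role names, and coordinates are hypothetical.

```python
import math
from itertools import combinations

EARTH_RADIUS_M = 6_371_000  # mean Earth radius; spherical model is an assumption


def great_circle_distance(a, b):
    """Haversine distance in meters between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def build_space_connectivity_graph(slot_nodes, radio_range_m):
    """Return (vertices, directed edges) for one slot of nodes."""
    vertices = list(slot_nodes)
    edges = []
    # combinations() pairs each node only with the nodes that succeed it,
    # so every unordered pair is examined exactly once.
    for node, candidate in combinations(vertices, 2):
        distance = great_circle_distance(node["location"], candidate["location"])
        if distance > radio_range_m:
            continue
        if node["role"] == "relay":        # only relays may originate an edge
            edges.append((node["id"], candidate["id"], distance))
        if candidate["role"] == "relay":
            edges.append((candidate["id"], node["id"], distance))
    return vertices, edges


slot_nodes = [
    {"id": "A", "role": "relay", "location": (52.4000, 16.9000)},
    {"id": "B", "role": "sensor", "location": (52.4005, 16.9000)},
    {"id": "C", "role": "relay", "location": (52.4100, 16.9000)},
]
vertices, edges = build_space_connectivity_graph(slot_nodes, radio_range_m=100.0)
```

With these sample nodes only the relay A reaches B. B is a simple node and cannot originate an edge, and C is roughly a kilometer away from both.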

It is advisable that the radio-connectivity-related functions $radioCoverage$ and $boundingRegion$ be either precomputed for each $node$ of the $SCV_{i}$ at the beginning of iteration i, or computed at the first usage and stored (cached) for future use, depending on the implementation. In the general urban use case, it can be assumed that the $radioCoverage$ and network area dimensions would be of different orders of magnitude. Therefore, not using $boundingRegion$, especially with a large number of highly dispersed nodes, would lead to a significant increase in the computational complexity. In cases where the $radioCoverage$ and dimensions of the network area tend to be of the same or similar orders of magnitude, especially in sparse networks, it might be beneficial to omit the computation and usage of $boundingRegion$. Similarly, the use of $boundingRegion$ may be counterproductive if it is of a similar or higher computational complexity than checking if a node belongs to a region defined by $radioCoverage$.

The upper bound of the complexity of the algorithm is $O(|slots| \cdot |SCV|_{MAX}^{2})$, i.e., it is related to the number of $slots$ (topology length) and the number of nodes in the graph with the largest number of nodes of all graphs of the topology.

2.3. Space Connectivity List to Space–Time Connectivity Graph

The construction of a space–time connectivity graph (STCG) is defined as a more complex Algorithm 3, space connectivity list to space–time connectivity graph (SCL-STCG), composed of several procedures. Based on the structure of each directed space graph $SCG_{i}$ of $SCL$, time node instances as well as space–time edges are added to the $STCG$. These edges are of two types, i.e., space and time, which indicates their role in the structure. Space edges connect different neighboring space nodes (devices) while time edges connect consecutive time instances of the same space node. Moreover, the notion of intra-slot and inter-slot edges is introduced to distinguish their graph roles. Intra-slot edges, which can be of space and time types, are used to construct the graph structure related to spatiotemporal relations within a given slot of nodes (based on $SCG_{i}$). Inter-slot edges are of the time type and connect slot-related structures (stages) to produce a space–time connectivity graph. An $STCG$ is therefore a one-way multipath multistage layered structure that follows the direction of time and is a directed acyclic graph, as depicted in Figure 5.

Algorithm 3: Space connectivity list to space–time connectivity graph

To enable various algorithms, such as the well-known ones related to finding shortest paths or trees, to process the constructed $STCG$ as the input structure, the edges share the same set of attribute types, e.g., $spaceDistance$ and $timeDistance$. These weights are computed or set based on defined unit costs. They are a means of modifying or tweaking the resulting cost structure to meet the needs of further modeling and analysis:

$intraSlotTimeEdgeSpaceUnitCost$, the space unit cost of a time edge within a slot:
− default value: 0;
− meaning: a nonzero value stands for the unit space cost related to the node (device) operating within a time slot. It can be used, for example, to model the cost of receiving location beacons;
$interSlotTimeEdgeSpaceUnitCost$, the space unit cost of a time edge between slots:
− default value: 0;
− meaning: a nonzero value stands for the unit space cost related to the node (device) transitioning between time slots. It can be used, for example, to model the cost of transmitting location beacons;
$intraSlotSpaceEdgeTimeUnitCost$, the time unit cost of a space edge within a slot:
− default value: 0;
− meaning: a nonzero value indicates the unit time cost related to transmitting a message between two devices, e.g., due to technology-dependent buffering or delays. It can be used, for instance, together with $intraSlotTimeEdgeTimeUnitCost$ to make path-finding algorithms favor intra-slot time edges over intra-slot space edges. This leads to maximizing buffering time in a single relay and minimizing the number of inter-node transmissions, and hence, the number of nodes involved. However, it can come at the expense of an overall increase in the space cost of the constructed $STCG$;
$intraSlotTimeEdgeTimeUnitCost$, the time unit cost of a time edge within a slot:
− default value: 0;
− meaning: should be considered in relation to $intraSlotSpaceEdgeTimeUnitCost$ for a given modeling scenario. It can also be used with $interSlotTimeEdgeTimeUnitCost$ to shape time-path cost properties of the $STCG$, e.g., as a tie-breaker;
$interSlotTimeEdgeTimeUnitCost$, the time unit cost of a time edge between slots:
− default value: 1;
− meaning: indicates the unit cost related to transitioning (buffering) a message over time by a node. It is of key significance for path searching scenarios that aim to optimize the message delivery time, e.g., to minimize the total time cost of a path. If set to 0, it may lead to unexpected or erroneous results in optimization algorithms which are based on ordering the weights of the edges. It can still be of use when consciously combined with properly selected values of $intraSlotSpaceEdgeTimeUnitCost$ and $intraSlotTimeEdgeTimeUnitCost$.
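
One convenient way to carry these five parameters through an implementation is a small immutable configuration object. The Python sketch below is only an illustration: the class name and the snake_case field names are assumptions, while the default values are those listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UnitCosts:
    """Unit costs used to weight STCG edges (defaults as listed in the text)."""
    intra_slot_time_edge_space_unit_cost: float = 0.0
    inter_slot_time_edge_space_unit_cost: float = 0.0
    intra_slot_space_edge_time_unit_cost: float = 0.0
    intra_slot_time_edge_time_unit_cost: float = 0.0
    inter_slot_time_edge_time_unit_cost: float = 1.0


defaults = UnitCosts()

# Hypothetical scenario: a small time cost on space edges makes a path-finding
# algorithm prefer buffering in a relay over an extra transmission on ties.
favor_buffering = UnitCosts(intra_slot_space_edge_time_unit_cost=0.01)
```

Freezing the dataclass keeps a cost profile from being modified while a graph is being built, so every edge of one $STCG$ is weighted consistently.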

Iterating over SCL, Procedure addTimeNodeInstancesAndEdges is invoked twice for $SCG_{1}$. The first, i.e., additional, call extends the sets of space–time connectivity edges (STCEs) and space–time connectivity vertices (STCVs) with time node instances and edges which represent the non-existent graph zero, as shown by Huang et al. [10]. Such an abstract graph $SCG_{0}$ with no space edges is required to provide correct starting points for path- and tree-finding algorithms and enable traversals based on the space and time metrics of the graph. For the remaining $SCG_{i}$ graphs, both space and time edges are constructed.

Procedure addTimeNodeInstancesAndEdges(i, SCG, STCE, STCV)

To add time node instances and edges in Procedure addTimeNodeInstancesAndEdges, each node of a given graph is used to make two new $STCV$ nodes, which represent the instances of the node at the “start” and “end” of time slot i. The creation of these nodes is defined in Procedure makeTimeNodeInstance. To initialize a new $timeNodeInstance$, first the attributes of $spaceNodeInstance$ are copied. Then, the $id$ of the space node is stored as $globalId$ to keep the reference of the time node instance to its parent space node. Next, a new $id$ is composed, i.e., the $id$ of the space node is prefixed with slot number i and a time instance type indicator, either “_s_” for slot start, or “_e_” for slot end. In this way, for example, node “1357” of slot (space connectivity graph) number 3 will be converted to slot start instance “3_s_1357”.

Procedure makeTimeNodeInstance(i, spaceNodeInstance, timeInstanceType)
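
A minimal Python sketch of the described procedure follows. It assumes that nodes are plain dictionaries and that the type indicator is passed as the single letter "s" or "e"; both are illustrative choices rather than details given in the text.

```python
def make_time_node_instance(i, space_node_instance, time_instance_type):
    """Create a slot start ("s") or slot end ("e") instance of a space node."""
    if time_instance_type not in ("s", "e"):
        raise ValueError("time_instance_type must be 's' or 'e'")
    # copy the attributes of the space node instance
    time_node_instance = dict(space_node_instance)
    # keep the reference to the parent space node
    time_node_instance["globalId"] = space_node_instance["id"]
    # compose the new id: <slot number>_<type>_<space node id>
    time_node_instance["id"] = f"{i}_{time_instance_type}_{space_node_instance['id']}"
    return time_node_instance


space_node = {"id": "1357", "role": "relay", "location": (52.40, 16.90)}
start_instance = make_time_node_instance(3, space_node, "s")
end_instance = make_time_node_instance(3, space_node, "e")
```

The original space node dictionary is left unchanged, so it can be reused to create both instances.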

Afterwards, following Procedure addTimeEdges, intraSlotTimeEdge and interSlotTimeEdge are added to $STCE$. The directions are defined by the $start$ and $end$ node attributes and additional labels are set, i.e., $spaceDistance$, $timeDistance$, $slotNumber$, and $type$ set to “time”. Here, $timeDistance$ can mean, for example, the delay or buffering time related to traversing the edge by a message. Current slot node instances are connected with a time edge of $intraSlotTimeEdgeSpaceUnitCost$ and $intraSlotTimeEdgeTimeUnitCost$. The current slot start instance is then linked with the newest slot end instance that exists in the set of $globalIds$, which is an attribute of $STCV$. This does not always mean it is connected with the end instance of the previous slot: the node instance might not have been present in the directly preceding slot, or the space node might not yet have been present in the space–time graph at all. Then, the slot end instance is added to the list of $endInstances$ of $globalId$ in the $globalIds$ set related to the space nodes of $STCV$. Next, in Procedure addSpaceEdges, each space edge of $SCG_{i}$ is converted to intraSlotSpaceEdge and added to $STCE$. Finally, $STCE$ and $STCV$ are used to compose the $STCG$.

Procedure addTimeEdges(STCE, STCV, timeNodeInstances)

Procedure addSpaceEdges(i, SCG, STCE)
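
The interplay of these procedures can be sketched in Python as follows. Several simplifications are assumptions of this sketch and not statements of the paper: vertices are reduced to id strings, space costs of time edges are fixed at 0, the unit costs are used directly as edge weights, and each space edge is oriented from the slot start instance of its origin to the slot end instance of its destination, which is one orientation that keeps the graph acyclic.

```python
def build_stcg(scl, intra_time_cost=0.0, inter_time_cost=1.0):
    """Build (STCV, STCE) from scl, a list of (vertex_ids, space_edges) per slot.

    space_edges are (u, v, distance) tuples.
    """
    stcv, stce = set(), []
    newest_end = {}  # globalId -> id of the newest slot end instance

    def add_time_node_instances_and_edges(i, vertex_ids):
        for global_id in vertex_ids:
            start, end = f"{i}_s_{global_id}", f"{i}_e_{global_id}"
            stcv.update((start, end))
            # intra-slot time edge: slot start -> slot end of the same space node
            stce.append({"start": start, "end": end, "type": "time",
                         "slotNumber": i, "spaceDistance": 0.0,
                         "timeDistance": intra_time_cost})
            # inter-slot time edge from the newest earlier end instance, if any
            if global_id in newest_end:
                stce.append({"start": newest_end[global_id], "end": start,
                             "type": "time", "slotNumber": i,
                             "spaceDistance": 0.0,
                             "timeDistance": inter_time_cost})
            newest_end[global_id] = end

    for i, (vertex_ids, space_edges) in enumerate(scl, start=1):
        if i == 1:
            # additional call that creates the abstract graph zero
            add_time_node_instances_and_edges(0, vertex_ids)
        add_time_node_instances_and_edges(i, vertex_ids)
        for u, v, distance in space_edges:
            # assumed orientation: start instance of u -> end instance of v
            stce.append({"start": f"{i}_s_{u}", "end": f"{i}_e_{v}",
                         "type": "space", "slotNumber": i,
                         "spaceDistance": distance, "timeDistance": 0.0})
    return stcv, stce


# Hypothetical three-slot topology; node "b" is absent from slot 2.
scl = [(["a", "b"], [("a", "b", 50.0)]),
       (["a"], []),
       (["a", "b"], [("b", "a", 40.0)])]
stcv, stce = build_stcg(scl)
```

In the sample topology, the inter-slot time edge of "b" runs from "1_e_b" directly to "3_s_b", which illustrates the case in which a start instance is linked to an end instance older than the directly preceding slot.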

The upper bound of the time complexity of Algorithm 3 is $O(|SCL| \cdot |SCV|_{MAX} \cdot |SCE|_{MAX})$, and hence, is related to the number of space connectivity graphs, the number of nodes in the graph with the largest number of nodes, as well as the number of edges in the graph with the largest number of edges of all graphs of the topology.

The space–time connectivity graph (STCG) is an extension of the existing layered space–time graph (STG) concept [10,13]. It unambiguously reflects the space and time dimensions of changing network topology, and hence, enables multi-criteria spatiotemporal design and analysis. The essential innovations are the presented duplication of space nodes for each time slot as $start$ and $end$ node instances, as well as the introduction of intra-slot and inter-slot edges and metrics (e.g., the sets of $s_{1s}$ and $s_{1e}$ nodes and edges in Figure 5). They enable the development of new optimization algorithms and the proper usage of existing effective path-finding ones designed for graphs of more traditional time-flow-ignoring contexts, i.e., static as compared to dynamic (evolving) graphs. It is worth noting the term time-expanded graph, which was used in a related context [14]. In spite of that structure being an even more simplistic model, the term itself can additionally be of use in relation to space–time graphs because it captures and highlights the time-related graph structure span.
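
As an illustration of reusing a classic path-finding algorithm on such a structure, the Python sketch below runs Dijkstra's algorithm over a list of edge dictionaries with $start$, $end$, and $timeDistance$ attributes. The edge list, the instance identifiers, and the function name are hypothetical and are not taken from Figure 5.

```python
import heapq


def shortest_path_cost(edges, source, target, weight="timeDistance"):
    """Dijkstra's algorithm over directed edges with non-negative weights."""
    adjacency = {}
    for edge in edges:
        adjacency.setdefault(edge["start"], []).append((edge["end"], edge[weight]))
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == target:
            return cost
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, w in adjacency.get(node, []):
            new_cost = cost + w
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                heapq.heappush(queue, (new_cost, neighbor))
    return None  # target not reachable


# Hypothetical fragment of a space–time graph: inter-slot time edges cost 1,
# all intra-slot edges cost 0.
edges = [
    {"start": "0_s_a", "end": "0_e_a", "timeDistance": 0.0},
    {"start": "0_e_a", "end": "1_s_a", "timeDistance": 1.0},
    {"start": "1_s_a", "end": "1_e_a", "timeDistance": 0.0},
    {"start": "1_s_a", "end": "1_e_b", "timeDistance": 0.0},
    {"start": "1_e_a", "end": "2_s_a", "timeDistance": 1.0},
    {"start": "2_s_a", "end": "2_e_b", "timeDistance": 0.0},
]
cost_slot_1 = shortest_path_cost(edges, "0_s_a", "1_e_b")
cost_slot_2 = shortest_path_cost(edges, "0_s_a", "2_e_b")
unreachable = shortest_path_cost(edges, "2_e_b", "0_s_a")
```

Because every edge follows the direction of time, no path leads back to an earlier instance, and the time cost of a path equals the number of inter-slot transitions it contains under these weights.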

In [14], the model of a time-aggregated graph is presented. This uses single instances of each node and edges between them. The edges are labeled with occurrence times of each connection. This alternative representation of the spatiotemporal graph defined as the space connectivity list visible in Figure 4 is presented in Figure 6a. There, the directed edge label $s_{1, 2, 5}$ means that the link originated by a mobile node existed in time slots 1, 2, and 5. Similarly, $s_{1-6}$ denotes a connection between stationary nodes that was present throughout the whole time span of the modeled network.
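
The labeling scheme can be illustrated with a short Python sketch that collapses a list of per-slot edge sets into one dictionary of slot labels. The node names and the edge sets are made up for illustration and do not reproduce Figure 6a.

```python
from collections import defaultdict


def aggregate_over_time(scl_edges):
    """Map each directed edge (u, v) to the sorted list of slots it occurred in.

    scl_edges holds one iterable of (u, v) pairs per slot, slot 1 first.
    """
    labels = defaultdict(list)
    for slot_number, edges in enumerate(scl_edges, start=1):
        for u, v in edges:
            labels[(u, v)].append(slot_number)
    return dict(labels)


# Hypothetical six-slot topology: one intermittent link, one permanent link.
scl_edges = [
    {("n1", "n3"), ("n2", "n4")},
    {("n1", "n3"), ("n2", "n4")},
    {("n2", "n4")},
    {("n2", "n4")},
    {("n1", "n3"), ("n2", "n4")},
    {("n2", "n4")},
]
labels = aggregate_over_time(scl_edges)
```

Here the intermittent link would carry the label $s_{1, 2, 5}$ and the permanent one the label $s_{1-6}$.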

The time-aggregated representation is not used to model STCGs because it does not enable direct use of well-known graph optimization and analysis algorithms which provide optimal solutions. However, methods are being developed that are aimed at solving these problems in time-aggregated graphs. The problem of determining minimum temporal paths is addressed by the algorithms for finding earliest-arrival, latest-departure, fastest, and shortest paths [15]. Methods are also presented for constructing a directed Steiner tree (DST) in a structure that resembles a space–time graph and is obtained by transforming a time-aggregated graph, called a temporal graph [16]. Similarly, in [10], the authors aim to construct a DST directly in the space–time graph used in their topology control efforts.

Importantly, a time-aggregated graph is a representation well-suited to capture the outcomes of algorithms that solve problems in STCGs. Therefore, it is used in the present research as a practical representation of the modeled first-contact and multicast graphs. Please see Figure 6b,c for examples.

2.4. Space–Time Connectivity Graph to First-Contact Graph

To build a first-contact graph (FCG), Algorithm 4, space–time connectivity graph to first-contact graph (STCG-FCG), is used. Such a graph is a time-aggregated graph with single instances of all $spaceNodes$ located at the coordinates of their first instances, i.e., the ones in $firstSlot$ (first SCG) in which the node was present. Each node is connected to each neighbor with a directed first-contact space or time edge. Space edges connect the nodes that are neighbors in the same $firstSlot$. Time edges connect them otherwise.

Algorithm 4: Space–time connectivity graph to first-contact graph
Input:
$STCG \leftarrow ($	`//` `directed space–time connectivity graph`
$STCE,$	`//` `set of space–time connectivity edges`
$STCV$	`//` `set of space–time connectivity vertices`
)
Output:
$FCG \leftarrow ($	`//` `directed first-contact graph`
$FCE,$	`//` `set of first-contact edges`
$FCV$	`//` `set of first-contact vertices`
)
1 $spaceNodes \leftarrow findFirstInstancesAndTimeNeighbors(STCG)$
2 $FCG \leftarrow buildFirstContactGraph(STCG.STCV, spaceNodes)$
3 output FCG

An illustrative time-expanded $FCG$ presented in Figure 7 was constructed in the $STCG$ introduced in Figure 5. The related time-aggregated form is depicted in Figure 6b. The presented edge labels, for instance, $s_{3}$ and $s_{4}$, indicate in which time slot of a given $slotNumber$ the $firstContactEdge$ existed between two nodes. Similarly, an example space–time multicast graph was constructed and is presented in Figure 6c and Figure 8. It is a time-respecting tree that connects, over time, mobile source node $n_{1}$, via intermediate relay nodes, with four stationary destination nodes $n_{2}$, $n_{4}$, $n_{5}$, and $n_{9}$.

The STCG-FCG algorithm starts with finding first instances of space nodes and their first time neighbors in the $STCG$. The Procedure findFirstInstancesAndTimeNeighbors iterates over each $nodeInstance$ of the $STCG$. At the beginning, it adds a new space node to the set of $spaceNodes$ if it does not contain a node indexed with the $globalId$ of the current $nodeInstance$. Key attributes of the new space node are inherited from the current $nodeInstance$, i.e., $globalId$ becomes $id$, its $id$ is set as $firstInstance$ and its $slotNumber$ becomes $firstSlot$. Moreover, an empty set of edges to successors is initialized. If a node indexed with $globalId$ was already present in $spaceNodes$, then $firstInstance$ and $firstSlot$ are updated if the $slotNumber$ of the current $nodeInstance$ is lower than the $firstSlot$ currently stored in $spaceNodes$ for the $globalId$ of interest. This means that the current $nodeInstance$ precedes the instance which has been so far considered the $firstInstance$ (occurrence) of the given $globalId$ (space node). The procedure closes with adding successors of the current $nodeInstance$ which are instances of different space nodes. Its complexity is related to the number of nodes of the $STCG$, and hence, upper bounded by $O(|STCV|^{2})$.

Procedure findFirstInstancesAndTimeNeighbors(STCG)
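The steps described for this procedure can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the two plain dictionaries that stand in for the $STCG$ (`instances` and `successors`) and the field names of the returned records are assumptions made here to keep the example dependency-free.

```python
def find_first_instances_and_time_neighbors(instances, successors):
    """Return {global_id: space node record} for a space-time connectivity graph.

    instances:  {instance_id: {"global_id": ..., "slot_number": int}}
    successors: {instance_id: iterable of successor instance_ids}
    Both containers are assumptions of this sketch, not the authors' data model.
    """
    space_nodes = {}
    for instance_id, attrs in instances.items():
        global_id, slot = attrs["global_id"], attrs["slot_number"]
        node = space_nodes.get(global_id)
        if node is None:
            # First time this device is seen: inherit the key attributes.
            node = {"id": global_id, "first_instance": instance_id,
                    "first_slot": slot, "successors": set()}
            space_nodes[global_id] = node
        elif slot < node["first_slot"]:
            # The current instance precedes the one recorded so far.
            node["first_instance"] = instance_id
            node["first_slot"] = slot
        # Keep only successors that are instances of other space nodes.
        for succ in successors.get(instance_id, ()):
            if instances[succ]["global_id"] != global_id:
                node["successors"].add(succ)
    return space_nodes
```

Each node instance and each of its outgoing edges is visited once, which stays within the quadratic upper bound stated above.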

When the set of $spaceNodes$ is ready, Procedure buildFirstContactGraph can be used to construct the $FCG$ out of the $STCG$. At the beginning, all first node instances need to be added to the $FCG$. Therefore, for each $spaceNode$ a $firstContactNode$ is obtained. Such a node is the space node of the $STCG$, which was determined to be the $firstInstance$ of a given $spaceNode$. Its $id$ is set to the $id$ of $spaceNode$, and then, the $firstContactNode$ can be added to the set of vertices $FCV$ of the $FCG$. In the next step, the edges going out from each $firstContactNode$ are added to the $FCG$. For each $node$ of the $FCG$, each $edge$ going out from the current $node$ is evaluated. The end node of the edge is set to be the current $neighbor$ of the $node$. Then, the $timeDistance$ between them is calculated and $contactEdge$ is defined as any edge connecting a $node$ to a $neighbor$, based on their $id$ and $globalId$, respectively. If such an edge does not exist in the $FCG$, then $edgeType$ is set. It is “time” if $timeDistance$ is positive and “space” otherwise. Then, $firstContactEdge$ is defined and added to the $FCG$. If $contactEdge$ exists in the $FCG$ and its $timeDistance$ is larger than the $timeDistance$ to the current $neighbor$ instance, then the $timeDistance$ of the edge of the $FCG$ is updated to the shorter $timeDistance$.

As a result, $FCG$ contains first instances of nodes labeled with space $id$. Those nodes are connected to their neighbors with first-contact edges. The upper bound of this procedure’s time complexity is $O(|spaceNodes|^{2})$.

Procedure buildFirstContactGraph(STCV, spaceNodes)
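A compact Python sketch of this procedure is given below. It is illustrative only and rests on several assumptions that are not stated in the text: the dictionary layout of `instances` and `space_nodes`, the definition of the time distance as the neighbor instance's slot number minus the node's first slot, and the choice to refresh the edge type whenever a shorter time distance replaces a longer one.

```python
def build_first_contact_graph(instances, space_nodes):
    """Return (fcv, fce) describing a first-contact graph.

    instances:   {instance_id: {"global_id": ..., "slot_number": int}}
    space_nodes: {global_id: {"id", "first_instance", "first_slot", "successors"}}
    fcv maps a space id to its first instance; fce maps (id, neighbor_id)
    to {"time_distance": int, "edge_type": "space" | "time"}.
    """
    # Step 1: one first-contact node per space node, labeled with the space id.
    fcv = {node["id"]: node["first_instance"] for node in space_nodes.values()}

    # Step 2: evaluate every recorded contact and keep the earliest one.
    fce = {}
    for node in space_nodes.values():
        for succ in node["successors"]:
            neighbor_id = instances[succ]["global_id"]
            # Assumption: distance in slots, counted from the node's first slot.
            time_distance = instances[succ]["slot_number"] - node["first_slot"]
            key = (node["id"], neighbor_id)
            edge = fce.get(key)
            if edge is None or time_distance < edge["time_distance"]:
                fce[key] = {
                    "time_distance": time_distance,
                    "edge_type": "time" if time_distance > 0 else "space",
                }
    return fcv, fce
```

Because only the shortest time distance per ordered pair of space nodes is kept, the result has at most one first-contact edge for each such pair.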

3. Simulation and Analysis Methodology

The main objective of the simulation study is to enable a multi-criteria evaluation and comparison of the proposed models and algorithms. Due to unknown characteristics of the underlying urban infrastructure, numerous features of the constructed networks are also of interest.

The simulation environment was built as an extension of custom-made network modeling software [9] which implements the network modeling architecture presented in Figure 9. It is based on Linux, PostgreSQL, and Python, as well as on the NetworkX library that implements basic data structures and numerous standard graph-related operations [17]. The graphs are visualized using map data provided by OpenStreetMap [18]. The research environment has been implemented on these software components due to their ubiquity, detailed documentation, and broad community support. They provide numerous base functionalities used by graph researchers, and hence have high popularity and proven value for data scientists.

The network modeling flow of the simulation follows a logical order in which the algorithms introduced in Section 2 are related to one another. The high-level steps of the simulation are depicted in Figure 10. The network topologies are constructed and analyzed as graphs. No actual radio propagation models, communication protocols, or power management mechanisms are simulated. By this means, technology and protocol agnosticism is ensured in all aspects. In this way, the efficacy and efficiency of the algorithms can be investigated and compared using graph theory methods. The key features and metrics of the modeled networks can be thoroughly analyzed as well.

3.1. Comparative Study Methodology

There exist no equivalent algorithms designed for modeling heterogeneous urban sensor networks. Therefore, the trends in the metrics of interest are compared and analyzed in relation to different urban areas, topology durations, and radio ranges. Space minimum and maximum spanning forests are constructed for the undirected analog of each space connectivity graph using Kruskal’s algorithm [19]. Those forests are graphs composed of sets of trees built for every connected component of a space connectivity graph. They are the generalized solutions of the problem of a time-sub-interval minimum spanning tree (TSMST) in a spatiotemporal network defined in [20]. Key metrics of the forests are compared to provide more insight into the momentary space connectivity topologies. The most informative node-related metrics of the constructed space–time connectivity graphs are compared to the related first-contact graphs. Further FCG-related parameters are gathered and investigated as well.
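To make the forest construction concrete, the following sketch applies Kruskal's algorithm with a simple union-find structure to an undirected graph that may be disconnected. The edge-list representation and the function name are assumptions of this example; the study itself relies on NetworkX, whose interface is not reproduced here.

```python
def spanning_forest(nodes, edges, maximum=False):
    """Kruskal's algorithm on an undirected graph; returns the forest's edges.

    nodes: iterable of hashable node ids
    edges: iterable of (u, v, cost), with cost as the space distance in meters
    """
    parent = {n: n for n in nodes}

    def find(n):
        # Follow parent links to the root, shortening the path on the way.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    forest = []
    # Ascending cost gives a minimum forest, descending cost a maximum one.
    for u, v, cost in sorted(edges, key=lambda e: e[2], reverse=maximum):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # the edge joins two different trees
            parent[root_u] = root_v
            forest.append((u, v, cost))
    return forest
```

For a graph with two components, e.g., a triangle and a separate pair of nodes, both variants return the same number of edges but different total costs.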

3.2. Statistical Analysis and Visualization

Statistical data are generated and gathered in the processes related to each step of the network modeling. Some of the metrics are calculated using custom-developed functions, while others are computed with the methods provided by the NetworkX graph modeling framework. The data are then represented as a pandas data structure called DataFrame [21]. In this way, advanced multidimensional data combining, filtering, categorization, and statistical processing is performed. However, the values of individual data points and their numerical aggregates are not the center of attention in this study; the trends and relationships between the parameters of the networks and their metrics are the key concerns. Therefore, the analyzed data are visualized with the Seaborn data visualization library [22].

The data of interest are of a discrete nature, and therefore, the analyzed sets are presented in the form of scatter plots. The styles of the points represent different subsets of the data. Regression lines are overlaid on the plots to make the trends more visible in larger, denser, and overlapping data sets of a single chart, as in Figure 18. The sets of closely related subplots are grouped in named rows or columns of a single plot, e.g., the charts related to different graph types and radio ranges in Figure 23. Furthermore, pair plots are used to present the relationships between the sets of variables, as in Figure 26. A number of plots are categorical to group and shift the data horizontally around the values of interest, which makes the categories more distinguishable, such as city or knowledge mode. This kind of plot also introduces small jitter, i.e., random deviations, to horizontal distributions of the categories to make them more visible when there are multiple closely related values present. It does not change the values, i.e., the vertical distribution of the measurements, as visible in Figure 34. Due to the multidimensional nature of the data, the figures related to the subsets of metrics are also grouped and discussed in dedicated sections, e.g., in Section 4.1, focused on the space connectivity nodes’ parameters.

3.3. Simulation Data Sources and Node Classes

An investigation and analysis of the data sources discussed in [9] lead to the conclusion that sets of open data sources which meet the requirements of the study only exist for four Polish cities. Other cities provide limited scope or do not provide similar open data at all. The authors did not succeed in discovering equivalent open sets related to urban areas in other countries either. Therefore, the sources for Gdańsk, Poznań, Warsaw, and Wrocław listed in Table 1 are used, as they provide data of comparable scope, granularity, and update frequency.

The geographic coordinates of the urban infrastructure elements extracted from the data are used in the presented comparative research and transformed using the introduced network modeling algorithms. In all four cities, the real-time locations of buses and trams are available. In Gdańsk and Poznań, the locations of public transport stops and ticket machines can be used. In Warsaw and Wrocław, ticket machines’ location data are not available. However, in the case of Wrocław, the coordinates of city bike rental stations and parking lots (as shown in Figure 1a) of Vozilla (city electric car rental service) can be used instead. Most of the data are available in JavaScript Object Notation (JSON) format, while the Poznań-related mobile nodes data are in Protocol Buffers (protobuf). The update intervals of the mobile node data range from a few seconds for the continuously updated sources to 10 and 20 s for Warsaw and Gdańsk, respectively. The numbers and locations of stationary nodes do not change that frequently. This means that even when the source updates the whole data set frequently, e.g., for air quality meters and city bike rental stations, the location-related data can remain unchanged for hours or days, just as in the case of the sources that are updated less frequently, e.g., every 24 h. To enable heterogeneous structure modeling, the data sources were classified into three meaningful logical classes: mobile advanced, stationary advanced, and stationary simple. The nodes of each class are assigned a network role in the connectivity modeling scenarios:

mobile advanced class ⇒ mobile relays;
stationary advanced class ⇒ stationary relays;
stationary simple class ⇒ stationary destinations.

It is assumed that advanced nodes are the nodes with more significant computing, storage, communications, and power resources. Therefore, they are capable of performing complex delay-tolerant network (DTN)-forwarding operations. In contrast, simple nodes are the simple recipients of the communication.

3.4. Simulation Areas and Example Modeled Networks

The four cities of interest (Gdańsk, Poznań, Warsaw, and Wrocław) are among the largest and the most populated ones in Poland, as presented in the following subsections. Warsaw is the most populated urban area, with more than two and a half times the population of Wrocław. Wrocław is twenty-three percent more populated than Poznań, while the population of Poznań is thirteen percent larger than that of Gdańsk. The population densities differ in a related way: Gdańsk is the least densely populated urban area, followed by Poznań, Wrocław, and Warsaw, which is almost two times more densely populated than Gdańsk. Interestingly, in terms of the expected numbers of public transport routes (lines) that operate during the day, Poznań is the city with the lowest number, followed by slightly more routes in Gdańsk and Wrocław. In Warsaw, twice as many routes are present on average.

An area of 3 by 2 kilometers was selected in each of the cities. This choice was aimed at covering partially alike and partially distinct regions that include both the busy heart as well as less dense surroundings of each urban area. A closer examination of the city topologies, visible in the presented figures, reveals unique terrain, building, street, and infrastructure layouts. Therefore, the modeled networks are expected to indicate both different and similar features.

The example graphs constructed in those areas show the state modeled on Wednesday, 27 November 2019, at 3:00 p.m., under the assumption of omnidirectional radio coverage. The relays are dark blue triangles and destination nodes are pink circles. The mobile nodes are the ones with black borders. Each node is presented at the location where it occurred for the first time. The destination regions are marked as red dashed rectangles. A solid link depicts a space connection, i.e., one that occurs without message buffering (in the same time interval). Label 30 (24 s, 21 m) in Figure 14 indicates that the edge exists in slot number 30, that the message has to be buffered for 24 slots in the relay before being forwarded, and that the space distance between the relay and the next hop node is 21 m.

3.4.1. Gdańsk

Population:
− Total: 486 thousand [35], in a metropolis of around one million in northern Poland;
− Density: 1797 per km² [36];
Public transport day routes: around 80 [37];
Simulation area:
− Latitude: 54.34398–54.36191;
− Longitude: 18.62036–18.66666;
Example space connectivity graph in Figure 11:
− Slot length: 6 s;
− Radio range: 100 m;
− Nodes: 121 (mobile relays: 9, stationary relays: 20, stationary destinations: 92);
− Average node degree: 2.45, edges: 148, space cost: 6177 m, connected components: 66.

3.4.2. Poznań

Population:
− Total: 547 thousand [38], in a metropolis of almost one million in west-central Poland;
− Density: 2031 per km² [36];
Public transport day routes: around 70 [39];
Simulation area:
− Latitude: 52.39853–52.41645;
− Longitude: 16.88965–16.93389;
Example space minimum spanning forest in Figure 12:
− Slot length: 6 s;
− Radio range: 100 m;
− Nodes: 171 (mobile relays: 38, stationary relays: 17, stationary destinations: 116);
− Average node degree: 1.20, edges: 103, space cost: 4904 m, connected components: 68.

3.4.3. Warsaw

Population:
− Total: 1.794 million [40], in a metropolis of 3 million in east-central Poland;
− Density: 3469 per km² [36];
Public transport day routes: around 190 [41];
Simulation area:
− Latitude: 52.22082–52.23879;
− Longitude: 20.97058–21.01454;
Example space maximum spanning forest in Figure 13:
− Slot length: 6 s;
− Radio range: 100 m;
− Nodes: 213 (mobile relays: 49, stationary relays: 4, stationary destinations: 160);
− Average node degree: 1.10, edges: 117, space cost: 8199 m, connected components: 96.

3.4.4. Wrocław

Population:
− Total: 674 thousand [42], in a metropolis of around 1.25 million in southwestern Poland;
− Density: 2192 per km² [36];
Public transport day routes: around 85 [43];
Simulation area:
− Latitude: 51.10015–51.11813;
− Longitude: 17.01273–17.05570;
Example first-contact graph in Figure 14:
− Slot length: 6 s;
− Radio range: 100 m;
− Nodes: 217 (mobile relays: 145, stationary relays: 2, stationary destinations: 70);
− Average node degree: 1.66, edges: 180, space cost: 3089 m, connected components: 120.

3.5. Simulation Architecture and Parameters

To enable multi-faceted modeling and analysis, the simulation architecture is based on object-oriented data structures implemented as a hierarchy of nested lists, presented in Figure 15. The simulations are executed with the parameters related to the algorithms which are the main steps in the modeling flow depicted in Figure 10. The key simulation scope characteristics (numbers), denoted with single capital letters, resulting from the architecture and parameters are also indicated:

Algorithm 1, network device data to slots of space nodes (NDD-SSN):
− period: 27 November 2019 from 3:00 p.m. to 5:00 p.m.;
− areas: 4 $\Rightarrow J = 4$;
  * $area_{1}$: $([54.34398, 54.36191], [18.62036, 18.66666])$, Gdańsk;
  * $area_{2}$: $([52.39853, 52.41645], [16.88965, 16.93389])$, Poznań;
  * $area_{3}$: $([52.22082, 52.23879], [20.97058, 21.01454])$, Warsaw;
  * $area_{4}$: $([51.10015, 51.11813], [17.01273, 17.05570])$, Wrocław;
− topology lengths: (75, 150, 300, 600, 1200);
  * durations: (7.5 min, 15 min, 30 min, 60 min, 120 min) $\Rightarrow L = 5$;
  * topologies: (16, 8, 4, 2, 1) $\Rightarrow N = 31$;
− slot length: 6 s $\Rightarrow S = 1200$;
− classes: (mobile advanced, stationary simple, stationary advanced);
− windows: (10 s, 24 h, 24 h);
− relays: (mobile advanced, stationary advanced).
Algorithm 2, slots of space nodes to space connectivity list (SSN-SCL):
− radio coverage: omnidirectional;
  * radio ranges: (25 m, 50 m, 100 m) $\Rightarrow Q = 3$;
  * space distance: great-circle distance between two nodes.
Algorithm 3, space connectivity list to space–time connectivity graph (SCL-STCG):
− unit cost:
  * intra-slot time edge space unit cost: 0;
  * inter-slot time edge space unit cost: 0;
  * intra-slot space edge time unit cost: 0;
  * intra-slot time edge time unit cost: 0;
  * inter-slot time edge time unit cost: 1.

The node data sources for the simulation are grouped into three classes (mobile advanced, stationary simple, and stationary advanced), as introduced in Table 1. A data-lookup window is associated with each of the classes. The widths of these windows were determined based on an analysis of the update frequencies of the sources. The data on mobile nodes are updated most frequently, i.e., as often as every few seconds. Therefore, a window of 10 s ensures that location changes will be reflected correctly in the modeled structures. Each node location is marked with the occurrence time. When the slot length is set to 6 s, a window of 10 s is also a means to correct brief node data or source outages, i.e., to avoid the node being missed in a single space connectivity graph when it was actually still present in the network. A longer window, in the case of rapidly moving nodes and frequent data outages, may lead to misrepresentation of the node at its previous known location. Hence, connections may appear which, in reality, could not be established at that point in time since the node was, in fact, already at a different location. Conversely, the node might have been able to establish links that were not modeled because the data were not available. Observation of the stationary nodes’ data leads to the conclusion that their locations change or are updated no more frequently than once every 24 h. Therefore, this interval is used as the window for fixed-node-related location data.
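One way to picture the data-lookup window is the sketch below, which selects, for a given slot time, the most recent position of each node that is not older than the window of its class. The record layout, the function name, and the interpretation of the window as a look-back interval ending at the slot time are assumptions of this example, not details taken from Algorithm 1.

```python
# Look-back windows in seconds per node class, following the listed parameters.
WINDOWS = {
    "mobile advanced": 10,
    "stationary simple": 24 * 3600,
    "stationary advanced": 24 * 3600,
}

def nodes_in_slot(records, slot_time, windows):
    """Return {node_id: (lat, lon)} with the latest position valid at slot_time.

    records: iterable of (node_id, node_class, timestamp_s, lat, lon)
    windows: {node_class: look-back window in seconds}
    """
    latest = {}
    for node_id, node_class, ts, lat, lon in records:
        # Accept only observations inside [slot_time - window, slot_time].
        if slot_time - windows[node_class] <= ts <= slot_time:
            if node_id not in latest or ts > latest[node_id][0]:
                latest[node_id] = (ts, lat, lon)
    return {node_id: (lat, lon) for node_id, (_, lat, lon) in latest.items()}
```

With a 10 s window, a vehicle last reported 5 s before the slot time is kept, whereas one last reported 20 s earlier is dropped; a stationary node reported an hour earlier remains present.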

The simulations were conducted in four urban areas of interest based on the data gathered on Wednesday, 27 November 2019 between 3:00 p.m. and 5:00 p.m. This 2 h period was selected because it includes afternoon rush hours in the middle of a work week and enables coverage of all the desired modeling and analysis scenarios. In the modeling process, the period is divided in Algorithm 1 into topologies with durations defined by topologyLength. The values of interest are 75, 150, 300, 600, and 1200, which form a geometric sequence with a common ratio of 2. These are the numbers of space connectivity graphs in the space connectivity list of a given topology. They allow the modeling and study of network structures that are related, and yet, have different properties and time spans. Therefore, the trends connected to the matters of scalability and optimization can be investigated.
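The relationship between the period, the slot length, and the topology lengths can be checked with a few lines of arithmetic. The variable names below are chosen for this illustration only.

```python
PERIOD_S = 2 * 60 * 60                        # 3:00 p.m. to 5:00 p.m.
SLOT_LENGTH_S = 6                             # slot length in seconds
TOPOLOGY_LENGTHS = (75, 150, 300, 600, 1200)  # slots (SCGs) per topology

slots = PERIOD_S // SLOT_LENGTH_S                                    # S
durations_min = [n * SLOT_LENGTH_S / 60 for n in TOPOLOGY_LENGTHS]   # minutes
topologies = [slots // n for n in TOPOLOGY_LENGTHS]  # topologies per length
total_topologies = sum(topologies)                                   # N

print(slots, durations_min, topologies, total_topologies)
```

The computed values agree with the listed parameters: 1200 slots, durations from 7.5 to 120 min, and 16 + 8 + 4 + 2 + 1 = 31 topologies.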

When the slotLength is set to 6 s, a series of topologies of 16, 8, 4, 2, and 1 space–time connectivity graphs are distinguished that last for 7.5, 15, 30, 60, and 120 min, respectively. The time of slotLength is expected to be sufficient to transmit the message between two neighboring nodes. Unlimited message storage (buffer) is assumed in each relay. Omnidirectional radio coverage is modeled for three effective radio ranges, i.e., 25 m, 50 m, and 100 m. These range limits are based on empirical observations that current popular sensing-related short-range wireless connectivity technologies tend to provide up to around 100 m range at higher throughputs in outdoor urban non-line-of-sight scenarios, depending on the transmit power [44] and data transmission parameters [45]. The space distance between two nodes is determined with the haversine formula, which computes the great-circle distance between two points on a sphere [12]. For each of the resulting graphs, minimum and maximum spanning forests are constructed.
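The distance test behind the omnidirectional coverage model can be sketched as follows. The haversine formula itself is standard; the Earth radius constant, the function names, and the dictionary of node positions are assumptions of this example rather than elements of the described software.

```python
from itertools import combinations
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius; an assumption of this sketch

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))

def space_edges(positions, radio_range_m):
    """positions: {node_id: (lat, lon)} -> [(u, v, distance_m)] within range."""
    edges = []
    for (u, pu), (v, pv) in combinations(positions.items(), 2):
        distance = haversine_m(*pu, *pv)
        if distance <= radio_range_m:
            edges.append((u, v, distance))
    return edges
```

Applied to the corners of the Poznań simulation area, the function yields roughly 2 km along the latitude span and roughly 3 km along the longitude span, which matches the stated 3 by 2 kilometer area.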

Then, space–time connectivity graphs (networks) are modeled based on each of 31 space connectivity lists. The default values of the unit costs of Algorithm 3, SCL-STCG, are used. By this means, correct time-shortest paths can be determined using Dijkstra’s algorithm based on the time distance weight of the edges. The process set up in this way aims to balance the buffering time with message forwarding, and therefore, buffering resources use with transmission-related power consumption of the relays. The related first-contact graphs are constructed as well. Time distance in those graphs means how much time, i.e., time slots, has to pass before the node will be close enough to the neighboring node (device) to establish a connection.
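The effect of the default unit costs on time-shortest paths can be illustrated with a small Dijkstra implementation. In this sketch, node instances are written as (node, slot) pairs, inter-slot time edges carry a time cost of 1, and intra-slot space edges carry a time cost of 0; the adjacency-dictionary layout and the tiny example graph are assumptions made for illustration.

```python
import heapq

def time_shortest_distances(edges, source):
    """edges: {u: [(v, time_cost), ...]} -> {node: minimal time distance}."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, cost in edges.get(u, ()):
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

# Relay "r" exists in slots 0..2; it meets "x" in slot 0 and "y" in slot 2.
example_edges = {
    ("r", 0): [(("x", 0), 0), (("r", 1), 1)],  # space edge, then time edge
    ("r", 1): [(("r", 2), 1)],                 # time edge (buffering)
    ("r", 2): [(("y", 2), 0)],                 # space edge
}
distances = time_shortest_distances(example_edges, ("r", 0))
```

Here the message reaches "x" without buffering (time distance 0) and reaches "y" after being buffered in the relay for two slots (time distance 2).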

The chosen simulation scope and parameters resulted in a total of 43,944 graphs being modeled, as listed in Table 2.
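The stated total can be reproduced from the scope characteristics with the arithmetic below. The decomposition is inferred from the description in this section (one minimum and one maximum spanning forest per space connectivity graph, and one space–time connectivity graph plus one first-contact graph per topology, city, and radio range); it is a plausibility check, not a copy of Table 2.

```python
CITIES = 4         # J
RADIO_RANGES = 3   # Q
SLOTS = 1200       # S
TOPOLOGIES = 31    # N

scls = CITIES * RADIO_RANGES     # space connectivity lists
scgs = scls * SLOTS              # space connectivity graphs
spanning_forests = 2 * scgs      # one minimum and one maximum forest per SCG
stcgs = scls * TOPOLOGIES        # space-time connectivity graphs
fcgs = stcgs                     # one first-contact graph per STCG
total = scgs + spanning_forests + stcgs + fcgs

print(scgs, spanning_forests, stcgs, fcgs, total)
```

Under these assumptions the sum is 14,400 + 28,800 + 372 + 372 = 43,944.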

3.6. Simulation Study Metrics

The studied metrics are depicted in the next sections, first in absolute values, then some are presented as ratios (percentages) related to the reference ones. The metrics are listed in this section in relation to the first type of modeled structures they are discussed for. Other types may use the same or related metrics. The parameters are further divided into those that pertain to nodes (devices) and those that relate to edges (connections) of the graphs (networks). All of the used metrics are non-negative.

Space connectivity:
(a) nodes:
− stationary destination nodes: the number of stationary destinations;
− stationary relay nodes: the number of stationary relays;
− mobile relay nodes: the number of mobile relays;
− mobile relay nodes to all nodes ratio: the percentage of mobile relay nodes as compared to the number of all nodes;
− all nodes: total number of nodes;
− connected components: the number of sets of nodes that are connected with each other by direct or indirect paths;
− nodes per component: average number of nodes in a component;
(b) edges:
− average node degree: average number of edges adjacent to a node;
− edges: total number of edges;
− cost: the sum of space distances of all edges;
Space–time connectivity:
(a) nodes:
− instances per node: the number of time nodes (instances) per unique device (space node);
− instances per mobile relay node: the number of time nodes (instances) per unique mobile relay device (space node).
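Several of the space connectivity metrics listed above can be computed from a node list and an undirected edge list, as in the following sketch. The function name and the input layout are assumptions of this example; in the described environment, such values are obtained with NetworkX and custom functions.

```python
def space_connectivity_metrics(nodes, edges):
    """nodes: iterable of ids; edges: list of undirected (u, v, distance_m)."""
    nodes = list(nodes)
    adjacency = {n: set() for n in nodes}
    for u, v, _ in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)

    # Count connected components with an iterative depth-first search.
    seen, components = set(), 0
    for start in nodes:
        if start in seen:
            continue
        components += 1
        seen.add(start)
        stack = [start]
        while stack:
            for nxt in adjacency[stack.pop()]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)

    return {
        "all nodes": len(nodes),
        "edges": len(edges),
        "connected components": components,
        "nodes per component": len(nodes) / components if components else 0.0,
        "average node degree": 2 * len(edges) / len(nodes) if nodes else 0.0,
        "cost": sum(d for _, _, d in edges),
    }
```

The average node degree is computed as twice the number of edges divided by the number of nodes, which is consistent with the Gdańsk example in Section 3.4.1 (148 edges and 121 nodes give approximately 2.45).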

4. Space Connectivity Analysis

Following the objectives and parameters defined in Section 3, Algorithm 1, network device data to slots of space nodes (NDD-SSN), was used to select and group physical device (space node) data into time slots. Then, Algorithm 2, slots of space nodes to space connectivity list (SSN-SCL), was applied to construct space connectivity lists (SCLs) related to each city and modeled radio range. With 6 s slots and a 2 h period of interest, each SCL consists of 1200 subsequent space connectivity graphs (SCGs). Four cities and three radio ranges require twelve SCLs. This results in 14,400 space connectivity graphs with 2.3 million nodes and 1.6 million edges in total. Those space connectivity lists are both the object of the analysis in this section and the key starting elements to construct the space–time connectivity graphs modeled in the next sections.

4.1. Space Connectivity Nodes

Each city network is characterized by a constant number of stationary destination nodes—Warsaw: 160, Poznań: 116, Gdańsk: 92, Wrocław: 70—as compared in Figure 16. The numbers of stationary relay nodes presented in Figure 17 are also constant but the decreasing order is quite different—Gdańsk: 20, Poznań: 17, Warsaw: 4, Wrocław: 2.

The mobile relay numbers are the ones that vary significantly across cities; they can be examined in Figure 18. These numbers also change over time in the areas of interest, although they oscillate around characteristic values (Warsaw: 74, Poznań: 39, Wrocław: 32, Gdańsk: 13). Only in the case of Warsaw can a distinctly increasing slope of the regression line be observed. This shows that the number of mobile relays (vehicles) increases on average, which may be caused both by the need to address the increasing number of rush-hour commuters, and by the traffic jams that may slow down the nodes. Warsaw also exhibits the largest deviations from the regression line, especially for lower values. Conversely, in Gdańsk the numbers increase more dynamically but the maximum values, i.e., the number of public transport vehicles, drop by around 10 in about the last 150 graphs (15 min). The number of mobile relay nodes in Wrocław tends to be more evenly distributed, while in Poznań the numbers are more tightly grouped.

The number of mobile nodes represents a different percentage of all the nodes in a given city. What is visible in Figure 19 is that the relative numbers are substantially lower in Gdańsk, while Warsaw and Wrocław tend to be close and characterized by the highest ratios, with Poznań falling slightly behind (Warsaw: 30%, Wrocław: 30%, Poznań: 23%, Gdańsk: 9%). The sums of the respective stationary node numbers are the lower bounds of the total numbers of nodes in SCL compared in Figure 20. The deviations, and hence the maximum values, are related to the number of mobile relay nodes changing over time (Warsaw: 237, Poznań: 172, Gdańsk: 125, Wrocław: 104).

4.2. Space Connectivity Edges

While the numbers of nodes are shared by the networks in the given areas for all of the investigated radio ranges, other graph metrics vary. It is clearly visible in the graphs of Figure 21 that as the radio range increases, the number of connected components decreases. The topologies in Wrocław are the most sparse, while in other cities the average number of nodes per connected component depends more on the radio range, as compared in Figure 22. The average node degrees in Figure 23 and the numbers of edges in Figure 24 increase with radio range. This follows the intuitive understanding that with increasing radio range, the resulting network will be less fragmented. Moreover, in different cities the changes in node radio range influence the structure of the network to different extents. This is related to the unique infrastructure topology and mobility patterns of each area. It is visible in particular when the changes in Warsaw and Wrocław are compared. Also, the average node degrees and edge numbers suggest that nodes in Poznań and Warsaw are denser and more evenly distributed.

Moreover, the parameters of minimum spanning forests constructed in space connectivity graphs are influenced by increasing radio range in different ways in different cities. When the networks are less fragmented, the number of edges in such forests, as well as the average node degrees, increase. However, they do not influence the distributions in the same way, as visible in the bottom row graphs of Figure 23 and Figure 24. Those figures do not present the parameters of maximum spanning forests since the numbers of edges and average node degrees were the same as for minimum spanning forests. The costs of the structures are different though, as presented side-by-side in Figure 25. There, the larger the radio range, the greater the cost and the difference between the costs of the minimum and maximum spanning forests.
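The equal edge counts of the minimum and maximum spanning forests follow from a standard property of spanning forests. It can be written as below, where $V$ is the node set of a space connectivity graph, $C$ is its number of connected components, $E_F$ is the edge set of any of its spanning forests, and $\bar{d}_F$ is the resulting average node degree; these symbols are introduced here for illustration and are not part of the original notation.

```latex
|E_F| = |V| - C, \qquad \bar{d}_F = \frac{2\,|E_F|}{|V|} = \frac{2\,(|V| - C)}{|V|}
```

Neither quantity depends on the edge costs, so only the total cost distinguishes the two forests. For the Poznań example in Section 3.4.2, 171 nodes and 68 connected components give 103 edges and an average node degree of about 1.20, as reported there.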

4.3. Space Connectivity Relationships

Figure 26 presents the relationships between multiple variables of the space connectivity graphs when the radio range was set to 50 m. It has already been shown that each range results in unique topological features of the networks. However, this particular range, being the middle one of the simulated values, was selected to give an overview of the general differences in characteristics of the urban areas (cities) under scrutiny. One set of parameters, i.e., the columns, changing along the horizontal axis of the figure, covers the slot (graph) number and the number of mobile relay nodes, as well as the number of edges. The other set, the rows, changing along the vertical axis of the figure, covers the numbers of mobile relay nodes, the numbers of edges, graph costs, average node degrees, and the number of connected components.

Both presented histograms, i.e., the number of mobile relay nodes and the number of edges, suggest that each urban area and infrastructure has its own features and parameters. Almost every other related graph proves this even more strongly by presenting mostly disjoint groups of measurements for each city. They display, though, a degree of correlation in terms of the increasing trends they follow. Interestingly, it is only in Warsaw that the number of connected components clearly decreases with the increase in the number of mobile relay nodes and edges. In Wrocław, the increases in the number of mobile relay nodes and in the number of edges are followed by an increase in the connected components. At first glance, one could think that this means that despite becoming more populated with mobile relays, the structure becomes more disconnected. Here, the situation is quite different, and when more mobile relay nodes are present, more nodes can be connected, and yet they form new disconnected components rather than becoming connected to the larger ones. The conditions in Poznań may seem even less intuitive, with the number of connected components increasing with the number of mobile relays and decreasing with the increase in the number of edges. This means that with more graph edges, the nodes are, on average, connected in larger groups (components). In this urban area, more mobile relays, like in Wrocław, cause more nodes to be connected and constitute more smaller groups. In Gdańsk, which is the city with the lowest number of mobile relay nodes, those numbers and slope features are influenced only to a small extent, but with noticeable deviations from the trend. The highest direct correlation and overlapping is present between the number of edges and the costs of the trees in all of the cities. The relationship is almost linear, with little deviation, which suggests that the average edge costs are similar. 
The average node degrees differ more, and yet, depend on the number of mobile relays in a way resembling the relationship between the number of mobile relay nodes and edges. In terms of edges to average node degree relationship, the city-related distributions appear as distinctive linearly condensed groups.

To investigate the features in more detail, the space connectivity parameters of each city are presented and analyzed for all radio ranges separately in the next subsections.

4.3.1. Space Connectivity in Gdańsk

Taking a closer look at the parameters of the space connectivity graphs in Gdańsk in Figure 27 reveals that the trends related to the connected components are linked not only with the number of mobile relay nodes and edges but also with the radio range. When the range increases from 25 to 50 m, the slope of the regression line changes from increasing to almost horizontal; when the radio range extends as far as 100 m, it begins to decrease. This tendency is also influenced by the largest number of stationary relays of all studied cities, as visible in Figure 17. This means that with a larger radio range more nodes can be reached and connected to make larger components of the graph. The average node degrees, and hence, complexity and costs, of such structures are significantly higher. It cannot be overlooked that in Gdańsk usually only a few mobile relays were present, which is about one order of magnitude fewer than in other cities. The distribution of the numbers of edges is correlated with the distribution of mobile relay nodes, becoming more flattened and shifted in the direction of higher values as the radio range increases. It should be noted that a similar situation occurs in the relationship between the number of edges and average node degree.

4.3.2. Space Connectivity in Poznań

The space connectivity parameters of the networks modeled in Poznań, compared in Figure 28, at first seem similar to the ones in Gdańsk. They share common features, but a number of the characteristics are quite distinctive. The first difference is the much larger number of mobile relays in Poznań. The distributions of edge numbers are more Gaussian and are shifted as the radio range increases. The average node degrees and graph costs are comparable in nature to those in Gdańsk. The slope of the trend of connected components in relation to edges is decreasing for all radio ranges. This shows that, even at the lowest radio range, the numbers of mobile relays are high enough, and their routes coincide with other nodes often enough, to make the network more connected and dense. The situation is also influenced by seventeen stationary relays, as introduced in Figure 17. Poznań can therefore be regarded as the city whose topology makes it easiest to achieve a high level of connectivity at the lowest cost.

4.3.3. Space Connectivity in Warsaw

The average number of mobile relay nodes in Warsaw is almost six times higher than in Gdańsk. Nevertheless, the distributions of the metrics of the Warsaw space connectivity graphs in Figure 29 share more similarities with Gdańsk than with Poznań. In Warsaw, the numbers of connected components, edges, and graph costs are twice as high as the ones in Gdańsk. Moreover, most of the metrics increase over time. The average node degree is the metric that stays at a similar level in both cities, being more compressed in Warsaw. Importantly, despite Warsaw having the largest number of mobile relays, for each radio range there are a number of graphs with only a few edges. Hence, stationary network nodes are more distributed and disconnected on their own than in Gdańsk and Poznań. The mere four stationary relay nodes (see Figure 17) do not improve the connectivity enough. However, the isolated stationary nodes become connected owing to the number of mobile relay nodes, which is the largest of all the cities and increases over time. This results in the highest numbers of edges of all the cities, as well as the largest graph costs.

4.3.4. Space Connectivity in Wrocław

In Figure 30, it is striking that Wrocław is the only city in which, in spite of tens of mobile relays, there are numerous graphs with no edges. This means that stationary nodes are heavily disconnected, especially when the radio range is at its lowest. Moreover, many of them are located beyond the routes and range of mobile relays. Wrocław is also the city with the lowest number of stationary relays, with only two nodes of this kind (see Figure 17). On the one hand, although Wrocław is the city with the lowest overall number of nodes, around 30% of its nodes are mobile relays, on a par with Warsaw. This is the largest percentage among the cities under scrutiny, as presented in Figure 19. On the other hand, the absolute number of mobile relay nodes is comparable to Poznań, which has quite different distributions related to the connected components. In the relationship between edges and connected components, the trend rises steeply at lower radio ranges and becomes almost vertical for the largest range. This stresses the uniqueness of both the topology of the area and the networked infrastructure deployment.

5. Space–Time Connectivity Analysis

Based on the space connectivity graphs analyzed in Section 4, space–time connectivity graphs (STCGs) were constructed using Algorithm 3, space connectivity list to space–time connectivity graph (SCL-STCG). Then, each STCG was used to build the related first-contact graph (FCG) with Algorithm 4, space–time connectivity graph to first-contact graph (STCG-FCG). While an STCG, which is a time-expanded graph, is mostly an intermediate network modeling structure, an FCG is a time-aggregated graph with a more practical meaning. The analysis of the FCG provides a general and easier-to-comprehend impression of how the space–time network changes in a real environment and how the adjacencies occur for the first time. Therefore, the results for 372 graphs of both space–time types are presented and discussed side by side.
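
The idea of a time-aggregated first-contact structure can be illustrated with a minimal Python sketch. It is not Algorithm 4 (STCG-FCG) from the article: as a simplifying assumption, it works directly on per-slot edge lists instead of a time-expanded STCG, and the function name `first_contact_graph` and the node labels are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, List, Tuple

Edge = Tuple[str, str]


def first_contact_graph(slots: List[Iterable[Edge]]) -> Dict[FrozenSet[str], int]:
    """Map each undirected adjacency to the index of the slot where it first occurs."""
    first_seen: Dict[FrozenSet[str], int] = {}
    for slot_index, edges in enumerate(slots):
        for u, v in edges:
            # setdefault keeps the earliest slot; later repetitions are ignored.
            first_seen.setdefault(frozenset((u, v)), slot_index)
    return first_seen


# Hypothetical three-slot example: one bus passes a stop and then a ticket
# machine; a second bus reaches the stop in the last slot.
slots = [
    [("bus_1", "stop_A")],
    [("bus_1", "stop_A"), ("bus_1", "machine_B")],
    [("bus_1", "machine_B"), ("bus_2", "stop_A")],
]

fcg = first_contact_graph(slots)
for pair, slot_index in sorted(fcg.items(), key=lambda item: (item[1], sorted(item[0]))):
    print(sorted(pair), "first contact in slot", slot_index)
```

Each device appears once in the resulting structure, and every edge carries the slot of its first occurrence, which is the property that makes this kind of graph compact compared with a time-expanded one.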

The graphs were constructed in four cities, at three radio ranges, and for five durations that divide the period of interest into respective networks. Each consecutive duration is twice the preceding one. Because of this geometric progression of the durations, the increasing trend in Figure 31 that appears exponential for the space–time connectivity graphs is in fact roughly linear. Conversely, the trend that appears linear for the first-contact graphs is closer to logarithmic.
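
The effect of plotting against doubling durations can be reproduced with a few lines of Python. The sketch below is only an illustration: the 30 min starting duration is an assumption, and both growth laws and their coefficients are made up rather than fitted to the results in Figure 31.

```python
import math

# Five durations, each twice the preceding one. The 30 min starting value
# is an assumption made for this illustration.
durations = [30 * 2**k for k in range(5)]  # 30, 60, 120, 240, 480

# Hypothetical growth laws with made-up coefficients.
linear = [2.0 * d for d in durations]
logarithmic = [100.0 * math.log2(d / 30) + 200.0 for d in durations]

# Steps between consecutive categories, which are equally spaced on the axis.
linear_steps = [b - a for a, b in zip(linear, linear[1:])]
log_steps = [b - a for a, b in zip(logarithmic, logarithmic[1:])]

print(durations)
print(linear_steps)  # each step is twice the previous one
print(log_steps)     # all steps are equal
```

When the categories are drawn at equal spacing, a quantity proportional to the duration doubles its step each time and therefore looks exponential, while a quantity that grows with the logarithm of the duration rises by a constant step and looks linear.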

Since the numbers of stationary nodes were constant in the space connectivity graphs, they also remain at the same level in the respective space–time connectivity graphs for each area and duration. The numbers of mobile relay nodes presented in Figure 32 vary, resulting in correlated changes in the overall number of nodes in the graphs. This is because not only the number of mobile relays changes over time but also the number of mobile nodes (devices) present in the area. Moreover, some leave the area while others enter at different points in time. In the spatiotemporal network there are more unique nodes (devices) than in any single space connectivity graph. This is the reason why the number of mobile relays in first-contact graphs is at least a few times higher than in the space connectivity graphs in Figure 18. This number barely exceeds 200 mobile relay nodes in Warsaw for the shortest duration. For the longest one, the number reaches 660 mobile relays. In other areas the numbers are lower. Interestingly, Poznań-related STCGs consist of more mobile nodes than the ones related to Wrocław. In FCGs, due to the structure of the connections, the situation is the opposite. Gdańsk remains the area with the lowest number of mobile relay nodes for all durations.

Importantly, the STCG construction algorithm is the reason for large numbers of nodes being present in space–time connectivity graphs, since each node of each SCG is represented by two instances in the STCG. This means that for each unique device (space node) in the observation area and duration, multiple instances (time nodes) are present in the space–time connectivity graph, as depicted in Figure 33. For example, the number of nodes for the 30 min duration in the STCGs ranges from around 75 thousand for Gdańsk to more than 140 thousand for Warsaw. These correspond to around 240 nodes for Gdańsk and 540 nodes for Warsaw in the related first-contact graphs. The numbers of instances of mobile relays are the highest in Poznań and Wrocław. This means that in those areas of interest particular mobile relays are on average present for the longest total time. Figure 33 shows the numbers only for the space–time connectivity graphs because in the first-contact graphs each actual device is represented by a single node.
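
As a rough consistency check on the figures quoted above, dividing the approximate STCG node counts by the approximate FCG node counts gives the average number of time-node instances per unique device for the 30 min duration. The inputs are the rounded values from the text (the Warsaw count is stated only as "more than 140 thousand"), so the results are indicative at best.

```latex
\frac{75{,}000}{240} \approx 312.5 \quad \text{(Gdańsk)},
\qquad
\frac{140{,}000}{540} \approx 259.3 \quad \text{(Warsaw)}
```

These ratios average over all node classes, so they do not contradict the observation that the instance counts of mobile relays alone are highest in Poznań and Wrocław.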

The numbers of connected components in the first-contact graphs (see Figure 34) for Gdańsk, Poznań, and Warsaw are lower than they are on average in the space connectivity graphs compared in Figure 21. This difference grows as the network duration and radio range increase. The same trend can be observed in Wrocław, with one difference: at the shortest durations and the smallest radio range, the numbers of connected components exceed the ones in the SCGs. This shows that the infrastructure in Wrocław is more fragmented and that more time is needed to make the network more connected.

The opposite trends can be observed in relation to the numbers of nodes per component in Figure 35 and the average node degrees in Figure 36, when compared to Figure 22 and Figure 23, respectively, which are related to these parameters of the space connectivity graphs. The much higher values of the space–time metrics are of key significance, because they prove that when mobile relay nodes are used to build a space–time network, the momentarily (temporarily) disconnected components (parts of the network) are connected over time and, as a result, larger and more dense time-spanning networks are constructed. Wrocław falls behind considerably, but it needs to be taken into account that the underlying space connectivity graphs were of the lowest average node degree as well.
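
The way a mobile relay joins momentarily disconnected components over time can be shown on a toy network. The following Python sketch is an illustration under stated assumptions, not the article's simulation code: the helper `component_stats`, the node names, and the three snapshots are hypothetical, and every node is assumed to be present in every snapshot.

```python
from typing import Dict, Iterable, Set, Tuple

Edge = Tuple[str, str]


def component_stats(nodes: Set[str], edges: Iterable[Edge]) -> Tuple[int, float, float]:
    """Return (connected components, nodes per component, average node degree)."""
    parent: Dict[str, str] = {n: n for n in nodes}

    def find(x: str) -> str:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Undirected edges without duplicates; self-loops are assumed absent.
    unique_edges = {frozenset(e) for e in edges}
    for edge in unique_edges:
        u, v = tuple(edge)
        parent[find(u)] = find(v)
    components = len({find(n) for n in nodes})
    return components, len(nodes) / components, 2 * len(unique_edges) / len(nodes)


# Hypothetical network: one bus visits three isolated stops in three snapshots.
nodes = {"bus_1", "stop_A", "stop_B", "stop_C"}
snapshots = [[("bus_1", "stop_A")], [("bus_1", "stop_B")], [("bus_1", "stop_C")]]

per_snapshot = [component_stats(nodes, edges) for edges in snapshots]
aggregated = component_stats(nodes, [e for edges in snapshots for e in edges])

print(per_snapshot)  # every snapshot: 3 components
print(aggregated)    # time-aggregated: 1 component
```

In this toy case each snapshot has three components and an average degree of 0.5, whereas the time-aggregated graph has a single component of four nodes and an average degree of 1.5, which mirrors the direction of the trends described above.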

6. Summary

This paper introduces new network connectivity modeling algorithms designed for realistic heterogeneous urban sensor networks. The presented methods use emerging publicly available data sources which provide the locations of different elements of urban infrastructure, public transportation vehicles, etc. Other types of related information are usually available as well. The family of related algorithms is presented as a set of commented pseudocode listings, examples, and clarifications. A multidimensional simulation architecture has been proposed and used to construct static (momentary) and dynamic (time-changing) topologies, i.e., space connectivity graphs, space–time connectivity graphs, and first-contact graphs, in selected areas of four large Polish cities: Gdańsk, Poznań, Warsaw, and Wrocław.

Key observations related to space connectivity analysis:

Large-scale network modeling with space connectivity graphs (SCGs) and multidimensional analysis can be conducted based on open data;
The unique topology and infrastructure features of each urban area influence the networks that are constructed;
The analyses of the data obtained for individual cities show that the highest overall number of nodes does not necessarily correspond to the population density of a city;
Mobile relays are present in the areas of interest in varying numbers and with varying distributions, which depend on the infrastructure of the cities concerned, including the number of daily public transport lines, vehicle frequency, street topology, and density of stops;
Changing the radio range of the nodes affects the modeled networks differently, depending on the distribution of the nodes. Immediate topologies are more fragmented for smaller radio ranges. In more fragmented areas, increased radio coverage is required to connect more nodes;
The radio coverage affects the costs of space connectivity graphs and their associated minimum and maximum spanning forests. The results show that as radio coverage increases, the number of edges and the cost of structures also increase. These increases are exponential, so the radio parameters of the designed networks must be carefully planned. In this way, excessive use of the wireless medium, as well as the computing and storage resources of the nodes, can be avoided.
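
The influence of the radio range on fragmentation and cost can be sketched in Python for a handful of nodes. This is a simplified illustration and not the article's modeling algorithm: it assumes a unit-disc rule (two nodes are adjacent when their great-circle distance does not exceed the radio range), takes the graph cost as the sum of edge lengths, and uses hypothetical coordinates and names. Only the 25, 50, and 100 m ranges come from the study.

```python
import math
from itertools import combinations
from typing import Dict, Tuple

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(a: Tuple[float, float], b: Tuple[float, float]) -> float:
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def connectivity(nodes: Dict[str, Tuple[float, float]],
                 radio_range_m: float) -> Tuple[int, float, int]:
    """Return (edge count, total edge length in metres, connected components)."""
    names = sorted(nodes)
    parent = {n: n for n in names}

    def find(x: str) -> str:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    edge_count, cost = 0, 0.0
    for u, v in combinations(names, 2):
        d = haversine_m(nodes[u], nodes[v])
        if d <= radio_range_m:  # unit-disc adjacency (assumption)
            edge_count += 1
            cost += d
            parent[find(u)] = find(v)
    return edge_count, cost, len({find(n) for n in names})


# Hypothetical stationary nodes on one meridian, about 22, 44, and 89 m apart.
nodes = {
    "node_a": (52.4000, 16.9000),
    "node_b": (52.4002, 16.9000),
    "node_c": (52.4006, 16.9000),
    "node_d": (52.4014, 16.9000),
}

results = {r: connectivity(nodes, r) for r in (25, 50, 100)}
for radio_range, (edges, cost, components) in results.items():
    print(f"{radio_range} m: {edges} edges, cost {cost:.0f} m, {components} components")
```

For these four nodes the number of components falls from three to one as the range grows from 25 to 100 m, while the number of edges and the total cost rise. The toy example shows only the direction of the effect, not its rate.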

Key observations related to space–time connectivity analysis:

Space–time connectivity graphs (STCGs), which consist of as many as hundreds of thousands of nodes, can be modeled based on space connectivity lists (SCLs) and used as the intermediate structure in space–time network modeling;
First-contact graphs (FCGs) are valuable compact-form indicators that capture how space–time networks develop over time and how the adjacencies occur for the first time;
The structures become more complex as network duration and radio coverage grow. This means that the topologies grow and their connectivity increases as well;
The sets of mobile nodes present in the areas change over time—some enter and some leave at different moments. In a space–time connectivity graph, there are more unique space nodes (individual devices) than in a single space connectivity graph. As a result, the number of mobile relays in first-contact graphs is at least a few times higher than in the related space connectivity graphs;
The average node degrees and the numbers of nodes per connected component are much higher in first-contact graphs than they are in space connectivity graphs, while the numbers of connected components are generally lower. This shows that using mobile relays to construct space–time networks enables the momentarily disconnected parts of a network to be connected over time. As a result, larger and denser time-spanning topologies are constructed.

Main contributions of presented work:

Urban sensor network graph-based modeling algorithms:
  − space connectivity modeling;
  − space–time connectivity modeling;
  − first-contact modeling;
Simulation study of introduced models and algorithms:
  − methodology for multidimensional modeling and analysis;
  − simulation architecture and custom-developed environment;
  − comparative investigation and observations for four Polish cities.

Suggested directions for further research:

Urban sensor network modeling:
  − Development of a reliable parametric network topology generator based on long-term observations of open data and timetables;
  − Construction of movement traces based on gathered node location open data;
  − Use of introduced modeling architecture as an element of stationary relay deployment planning;
  − Use of presented modeling algorithms in other fields:
    * social trends analysis;
    * multi-criteria route optimization for public service vehicles;
    * urban planning of bicycle paths, street infrastructure, etc.;
Graph-based study of presented algorithms:
  − Investigation of extended scale and scope:
    * more data sources, including the closed ones, when available;
    * more areas of different locations and sizes, e.g., suburbs, small cities, countrysides, etc.;
  − Advanced radio connectivity modeling:
    * use of actual connectivity data, when available;
    * use of complex radio coverage models.

In total, nearly 44 thousand graphs were built, enabling the study of the modeled networks. Both well-known and newly introduced graph-theory metrics were presented. The key features, trends, and relationships were analyzed, compared, and discussed, showing the usability of the introduced urban sensor network modeling algorithms. Further research directions were presented.

Author Contributions

B.M., M.P. and P.Z.: validation, writing—review and editing; B.M. and P.Z.: conceptualization; B.M.: data curation, formal analysis, investigation, methodology, resources, software, visualization, writing—original draft; P.Z.: funding acquisition, project administration, supervision. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Polish Ministry of Science and Higher Education (No. 0313/SBAD/1310) and was also supported in part by the grant to maintain research potential of Kazimierz Wielki University (Ministry of Education and Science, grant 2023).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Musznicki, B.; Zwierzykowski, P. Survey of Simulators for Wireless Sensor Networks. Int. J. Grid Distrib. Comput. 2012, 5, 23–50.
2. Murty, R.N.; Mainland, G.; Rose, I.; Chowdhury, A.R.; Gosain, A.; Bers, J.; Welsh, M. CitySense: An Urban-Scale Wireless Sensor Network and Testbed. In Proceedings of the 2008 IEEE Conference on Technologies for Homeland Security, Waltham, MA, USA, 12–13 May 2008; pp. 583–588.
3. Bonola, M.; Bracciale, L.; Loreti, P.; Amici, R.; Rabuffi, A.; Bianchi, G. Opportunistic communication in smart city: Experimental insight with small-scale taxi fleets as data carriers. Ad Hoc Netw. 2016, 43, 43–55.
4. Dias, D.S.; Costa, L.H.M.; de Amorim, M.D. Data offloading capacity in a megalopolis using taxis and buses as data carriers. Veh. Commun. 2018, 14, 80–96.
5. Musznicki, B.; Kowalik, K.; Kołodziejski, P.; Grzybek, E. Mobile and Residential INEA Wi-Fi Hotspot Network. In Proceedings of the 13th International Symposium on Wireless Communication Systems (ISWCS 2016), Poznań, Poland, 20–23 September 2016.
6. Biljecki, F. A Global Feature-Rich Network Dataset of Cities and Dashboard for Comprehensive Urban Analyses. Sci. Data 2023, 10, 1–15.
7. Peixoto, J.P.J.; Bittencourt, J.C.N.; Jesus, T.C.; Costa, D.G.; Portugal, P.; Vasques, F. Exploiting geospatial data of connectivity and urban infrastructure for efficient positioning of emergency detection units in smart cities. Comput. Environ. Urban Syst. 2024, 107, 102054.
8. Musznicki, B. Empirical Approach in Topology Control of Sensor Networks for Urban Environment. J. Telecommun. Inf. Technol. 2019, 1, 47–57.
9. Musznicki, B.; Piechowiak, M.; Zwierzykowski, P. Modeling Real-Life Urban Sensor Networks Based on Open Data. Sensors 2022, 22, 9264.
10. Huang, M.; Chen, S.; Zhu, Y.; Xu, B.; Wang, Y. Topology Control for Time-Evolving and Predictable Delay-Tolerant Networks. In Proceedings of the 2011 IEEE Eighth International Conference on Mobile Ad-Hoc and Sensor Systems, Valencia, Spain, 17–22 October 2011; pp. 82–91.
11. Ferreira, A. On models and algorithms for dynamic communication networks: The case for evolving graphs. In Proceedings of the ALGOTEL 2002, Mèze, France, May 2002.
12. Robusto, C.C. The cosine-haversine formula. Am. Math. Mon. 1957, 64, 38–40.
13. Merugu, S.; Ammar, M.H.; Zegura, E.W. Routing in Space and Time in Networks with Predictable Mobility; Technical Report; Georgia Institute of Technology: Atlanta, GA, USA, 2004.
14. George, B.; Shekhar, S. Time-aggregated graphs for modeling spatio-temporal networks. In Journal on Data Semantics XI; Springer: Berlin/Heidelberg, Germany, 2008; pp. 191–212.
15. Wu, H.; Cheng, J.; Huang, S.; Ke, Y.; Lu, Y.; Xu, Y. Path Problems in Temporal Graphs. Proc. VLDB Endow. 2014, 7, 721–732.
16. Huang, S.; Fu, A.W.C.; Liu, R. Minimum Spanning Trees in Temporal Graphs. In Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data (SIGMOD’15), New York, NY, USA, 31 May–4 June 2015; pp. 419–430.
17. NetworkX. Network Analysis in Python. Available online: https://networkx.org (accessed on 20 October 2023).
18. OpenStreetMap. Available online: https://www.openstreetmap.org/copyright (accessed on 20 October 2023).
19. Kruskal, J.B. On the shortest spanning subtree of a graph and the traveling salesman problem. Proc. Am. Math. Soc. 1956, 7, 48–50.
20. Gunturi, V.; Shekhar, S.; Bhattacharya, A. Minimum Spanning Tree on Spatio-Temporal Networks. In Database and Expert Systems Applications; Bringas, P.G., Hameurlain, A., Quirchmayr, G., Eds.; Springer: Berlin/Heidelberg, Germany, 2010; pp. 149–158.
21. McKinney, W. Data Structures for Statistical Computing in Python. In Proceedings of the 9th Python in Science Conference, Austin, TX, USA, 28 June–3 July 2010; pp. 56–61.
22. Waskom, M.L. Seaborn: Statistical data visualization. J. Open Source Softw. 2021, 6, 3021.
23. Open Gdańsk. GPS Positions of the Vehicles. Available online: https://ckan.multimediagdansk.pl/dataset/tristar/resource/0683c92f-7241-4698-bbcc-e348ee355076 (accessed on 20 October 2023).
24. Open Gdańsk. List of Bus Stops. Available online: https://ckan.multimediagdansk.pl/dataset/tristar/resource/4c4025f0-01bf-41f7-a39f-d156d201b82b (accessed on 20 October 2023).
25. Open Gdańsk. Positions of Ticket Machines. Available online: https://ckan.multimediagdansk.pl/dataset/tristar/resource/af7bf4a9-e62e-4af2-906a-fa27c2532dfd (accessed on 20 October 2023).
26. ZTM Poznań. For Developers—GTFS-RT. Available online: https://www.ztm.poznan.pl/pl/dla-deweloperow/gtfsRtFiles (accessed on 20 October 2023).
27. Poznań. Positions of Public Transport Stops. Available online: http://www.poznan.pl/mim/plan/map_service.html?mtype=pub_transport&co=cluster (accessed on 20 October 2023).
28. Poznań. Positions of Ticket Machines. Available online: http://www.poznan.pl/mim/plan/map_service.html?mtype=pub_transport&co=class_objects&class_id=4000 (accessed on 20 October 2023).
29. Warsaw Open Data. Public Vehicle Positions—API Documentation. Available online: https://api.um.warszawa.pl/files/9fae6f84-4c81-476e-8450-6755c8451ccf.pdf (accessed on 20 October 2023).
30. Warsaw Open Data. Available online: https://api.um.warszawa.pl (accessed on 20 October 2023).
31. Wrocław Open Data. Positions of Public Transporation Vehicles. Available online: https://www.wroclaw.pl/open-data/dataset/lokalizacjapojazdowkomunikacjimiejskiejnatrasie_data (accessed on 20 October 2023).
32. Wrocław Open Data. Wrocław City Bike Stations. Available online: https://www.wroclaw.pl/open-data/dataset/nextbikesoap_data/resource/42eea6ec-43c3-4d13-aa77-a93394d6165a (accessed on 20 October 2023).
33. Wrocław Open Data. Vozilla—City Electric Car Rental—Parking Lots. Available online: https://www.wroclaw.pl/open-data/dataset/wykaz-miejsc-parkingowych-miejskiej-wypozyczalni-samochodow-elektrycznych-vozilla (accessed on 20 October 2023).
34. Airly Developer. Documentation. Available online: https://developer.airly.org/en/docs (accessed on 20 October 2023).
35. Gdańsk w Liczbach. Liczba Mieszkańców Gdańska. Available online: https://www.gdansk.pl/gdansk-w-liczbach/mieszkancy,a,108046 (accessed on 20 October 2023).
36. Geoportal Krajowy Na Mapie. Available online: https://geoportal-krajowy.pl (accessed on 20 October 2023).
37. Gdańsk Municipal Transport Authority. Timetables. Available online: https://ztm.gda.pl/rozklady (accessed on 20 October 2023).
38. Poznan.pl. Znamy Liczbę Mieszkańców Poznania. Available online: https://www.poznan.pl/mim/info/news/znamy-liczbe-mieszkancow-poznania,188075.html (accessed on 20 October 2023).
39. Poznań Municipal Transport Company. Timetable. Available online: https://www.mpk.poznan.pl/en/timetable/ (accessed on 20 October 2023).
40. Statystyka Warszawy. Miasto Warszawa. Available online: https://um.warszawa.pl/statystyka-warszawy-2022 (accessed on 20 October 2023).
41. Warsaw Public Transport. Timetables. Available online: https://www.wtp.waw.pl/en/timetables/ (accessed on 20 October 2023).
42. Statistical Office in Wroclaw. Population. Available online: https://wroclaw.stat.gov.pl/en/zakladka2/ (accessed on 20 October 2023).
43. Wrocław Municipal Transport Company. Timetable. Available online: https://www.wroclaw.pl/komunikacja/rozklady-jazdy (accessed on 20 October 2023).
44. Karvonen, H.; Pomalaza-Ráez, C.; Mikhaylov, K.; Hämäläinen, M.; Iinatti, J. Experimental Performance Evaluation of BLE 4 versus BLE 5 in Indoors and Outdoors Scenarios. In Proceedings of the Advances in Body Area Networks I; Springer International Publishing: Berlin/Heidelberg, Germany, 2019; pp. 235–251.
45. Ferreira, A.E.; Ortiz, F.M.; Costa, L.H.M.; Foubert, B.; Amadou, I.; Mitton, N. A study of the LoRa signal propagation in forest, urban, and suburban environments. Ann. Telecommun. 2020, 75, 333–351.

Figure 1. Examples of connected devices in cities in Poland. (a) Electric rental car and parking lot in Wrocław in December 2019; (b) bus, scooter, and bike rental station in Poznań in August 2019; (c) electric kick scooters in Gdańsk in November 2019.

Figure 2. Network modeling flow.

Figure 3. Space connectivity graph modeling example.

Figure 4. Example space connectivity list.

Figure 5. Example space–time connectivity graph.

Figure 6. Example time-aggregated graphs.

Figure 7. Example expanded first-contact graph.

Figure 8. Example expanded space–time multicast graph.

Figure 9. Open-data-based network modeling architecture [9].

Figure 10. Simulation modeling flow.

Figure 11. Example space connectivity graph in Gdańsk.

Figure 12. Example space minimum spanning forest in Poznań.

Figure 13. Example space maximum spanning forest in Warsaw.

Figure 14. Example first-contact graph in Wrocław.

Figure 15. Simulation architecture.

Figure 16. Stationary destination nodes in space connectivity graphs.

Figure 17. Stationary relay nodes in space connectivity graphs.

Figure 18. Mobile relay nodes in space connectivity graphs.

Figure 19. Ratio of mobile relay nodes to all nodes in space connectivity graphs.

Figure 20. All nodes in space connectivity graphs.

Figure 21. Connected components in space connectivity graphs.

Figure 22. Average number of nodes per connected component in space connectivity graphs.

Figure 23. Average node degree of space connectivity graphs and minimum spanning forests.

Figure 24. Edges of space connectivity graphs and minimum spanning forests.

Figure 25. Cost of space connectivity graphs and spanning forests.

Figure 26. Space connectivity parameters in Polish cities at 50 m radio range.

Figure 27. Space connectivity parameters in Gdańsk.

Figure 28. Space connectivity parameters in Poznań.

Figure 29. Space connectivity parameters in Warsaw.

Figure 30. Space connectivity parameters in Wrocław.

Figure 31. All nodes in space–time connectivity and first-contact graphs.

Figure 32. Mobile relay nodes in space–time connectivity and first-contact graphs.

Figure 33. Instances per node in space–time connectivity graphs.

Figure 34. Connected components in first-contact graphs.

Figure 35. Nodes per connected component in first-contact graphs.

Figure 36. Average node degree in space–time first-contact graphs.

Table 1. Urban open data sources used in simulation study.

City	Class	Scope	Format	Updates	Provider
Gdańsk	Mobile advanced	Buses and trams [23]	JSON	20 s	Open Gdańsk
Gdańsk	Stationary simple	Public transport stops [24]	JSON	24 h	Open Gdańsk
Gdańsk	Stationary advanced	Ticket machines [25]	JSON	24 h	Open Gdańsk
Poznań	Mobile advanced	Buses and trams [26]	protobuf	Continuous	ZTM Poznań
Poznań	Stationary simple	Public transport stops [27]	JSON	Infrequent	Poznan City Hall
Poznań	Stationary advanced	Ticket machines [28]	JSON	Infrequent	Poznan City Hall
Warsaw	Mobile advanced	Buses and trams [29]	JSON	10 s	City of Warsaw
Warsaw	Stationary simple	Public transport stops [30]	JSON	Infrequent	City of Warsaw
Wrocław	Mobile advanced	Buses and trams [31]	JSON	Continuous	Open Data Wrocław
Wrocław	Stationary simple	City bike rental stations [32]	JSON	5 min	Open Data Wrocław
Wrocław	Stationary simple	Vozilla parking lots [33]	JSON	Continuous	Open Data Wrocław
All	Stationary advanced	Air quality meters [34]	JSON	Continuous	Airly

Table 2. Numbers of modeled graphs.

Category	Type	Number of Graphs
Space	Connectivity	14,400
Space	Minimum spanning forest	14,400
Space	Maximum spanning forest	14,400
Space–time	Connectivity	372
Space–time	First-contact	372
In total		43,944
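
The totals in Table 2 can be checked with simple arithmetic. The second line is an inference rather than a statement from the article: it assumes that the five doubling durations split the period of interest into 16, 8, 4, 2, and 1 networks for each of the four cities and three radio ranges.

```latex
\begin{align*}
3 \times 14{,}400 + 2 \times 372 &= 43{,}200 + 744 = 43{,}944 \\
4 \times 3 \times (16 + 8 + 4 + 2 + 1) &= 12 \times 31 = 372
\end{align*}
```

Under that assumption, the count of 372 space–time graphs of each type is reproduced exactly.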

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

