1. Introduction
With every passing year animal habitats shrink, and new transport infrastructures (e.g., highways and railways) are built across densely forested areas to connect cities. Thus, wild animals often stray into human habitats and cross roads or rails, which sometimes results in the deaths of humans as well as animals, often unintentionally, e.g., through animal-vehicle collisions. Apart from Asia and Africa, where animal-human conflict is common [1], statistics [2] show that in Germany alone 194,410 animals were involved in animal-vehicle collisions in the years 2015–2016. Even though technological advancements in computing and networking are today used to build smart cities, smart rails, etc., these systems have not been designed with animal-human cohabitation in mind [3].
Various sensors, such as infrared cameras, optical fiber sensors and radars, could be added to smart hot zones near roads/rails or areas adjoining human habitation to detect the movement of animals and warn humans/vehicles in advance about the likely presence of foraging animals. Although a few notable works have studied some aspects of the animal-human cohabitation problem [1,4,5,6,7,8,9,10,11,12,13], a warning system that intelligently processes data across the various layers of the networking architecture, from Internet-of-Things (IoT) devices such as sensors through fog to the cloud, has not yet been integrated into such systems. Processing raw data near the sensors can be helpful in scenarios where energy consumption is an issue due to the limited energy source of the end devices: large unprocessed sensed data takes longer to transmit than processed, relevant data, and longer transmissions consume more energy on constrained end devices. However, data processing at constrained end devices, e.g., a Raspberry Pi, with limited compute and storage capability could result in more false alarms and longer processing times, and the processing task itself also consumes energy. Thus, an important aspect of a smart early warning system is the trade-off between the amount of data processed at the end devices and the data sent to upper layers for inference (and alert generation), such that the energy consumed and the end-to-end latency of transmission and processing tasks are minimized.
To take advantage of today's smart systems when building early warning systems, we consider three recent trends: fog-to-cloud distributed computing, artificial intelligence methods, and fiber-wireless (FiWi) networks. With fog-to-cloud distributed computing, we can intelligently store, process and communicate data among the edge, fog and cloud layers [14]. The ability of fog computing to process data locally is critical for a latency-sensitive early warning system. Artificial intelligence (AI) techniques, such as machine learning, would on the other hand help alleviate network congestion by processing raw data at AI-enabled edge devices and transmitting only useful information to fog nodes (e.g., base stations), exploiting their built-in capabilities of learning, analysis and decision making [3,15]. For instance, a small edge computing device, e.g., a Raspberry Pi, integrated with a global positioning system (GPS), a camera and other sensors could run customized AI algorithms to transfer only relevant data to the cloud, which could reduce the overall energy consumed in data capturing, processing and transmission. Finally, we propose to build the early warning network on integrated passive optical and wireless networks, known as fiber-wireless (FiWi) networks, because a passive optical network (PON) is an economical way to provide not only broadband services but also access for next-generation (5G) mobile networks.
In this paper, we propose and study an early warning system using an IoT-fog-to-cloud system comprising IoT devices (cameras, sensors), base stations (BSs) and the cloud. In the proposed system, we first detect movements of animals at sensor nodes, then transfer sensor data to fog nodes (at BSs) over wireless sensor aggregation networks, and finally to the cloud over a passive optical network for further processing. The processed information is then distributed to humans (or vehicles) living in (or passing by) areas near animal habitats. Since an early warning system is a latency-critical application, the main question we try to answer is whether to transfer all data to higher layers for better inference (with respect to classification or alert accuracy) or to process data at edge devices before transferring it to the fog nodes (e.g., base stations). At the same time, however, processing data at its origin, i.e., at edge devices (which results in transferring less data), could generate false alarms due to the limited storage and computing power of edge devices. Thus, we first examine the trade-off between communication cost and computation/processing cost through an experimental setup, by measuring the average current drawn (or energy consumed) during image capturing, processing and transmission. Then, through simulation, we show the trade-off between end-to-end latency (the time between an animal's appearance and the notification) and the data volume transferred for each animal detection event in our distributed computing and AI-enabled early warning system. Our experimental results show that data processing at end devices could save around 57% energy.
2. FiWi Networks
The dependency on cloud-based services is growing, and so is the amount of network traffic from IoT devices to the cloud, and vice versa. Thus, hybrid fiber-wireless networks present a viable solution to ensure high capacity, high flexibility and low-cost broadband access to cloud-based services [
19]. A typical FiWi architecture integrates a passive optical network with wireless networks, where each optical network unit (ONU) is connected to several wireless nodes (e.g., sensors, smartphones) through a wireless gateway (e.g., a base station).
Figure 1 illustrates an envisioned FiWi architecture that integrates a passive optical network (PON) and wireless sensor networks (WSNs), and shows an example of network subsystems in all three layers: edge (sensors), fog (BSs, ONUs), and the cloud. As the main building block of a FiWi network, a PON consists of an optical line terminal (OLT) at the service provider's central office, a passive splitter and optical network units (ONUs). The OLT is terminal equipment that includes a gateway router connecting the PON to the fiber backbone or core networks; its main function is to define transmission windows and allocate bandwidth to the ONUs located near the end users. ONUs are expected to serve heterogeneous traffic of end devices, since in 5th-generation mobile networks they would carry both wired traffic, e.g., from fiber-to-the-home (FTTH) subscribers, and wireless traffic through BSs. A single passive splitter, or a group of them, allows a PON to be shared among many end users by splitting power in an Ethernet PON and/or demultiplexing wavelengths in a wavelength-division multiplexing (WDM) PON.
In this paper, we use an orthogonal frequency-division multiplexing (OFDM)-PON architecture, where each ONU processes only its assigned fraction of the OFDM spectrum generated by the OLT. We choose it because of its better performance in terms of latency and synchronization compared to other PON technologies [20]. Moreover, the dynamic bandwidth allocation techniques used in an OFDM-PON fully exploit the spectrum flexibility. This motivates us to allocate subcarrier bandwidth based on the traffic volume of each class (cellular, FTTH and animal) within each ONU. At the same time, however, such segregation of bandwidth requires the classification and prediction of traffic volumes, which can be done through AI techniques. Various AI-based applications can be envisioned in Figure 1, including the management of network devices and bandwidth allocation. For instance, instead of running centralized bandwidth management of the PON subsystems in the cloud, the central office (OLT) could dynamically allocate spectrum in the PON subsystems based on the traffic predicted at its ONUs. This would improve both bandwidth management and latency.
In
Figure 1, wireless sensor nodes (edge devices) form an aggregation network along the road (e.g., mimicking a point of human-animal conflict) and are wirelessly connected to so-called sinks, denoted as base stations (BSs). Whenever a WSN node detects the presence of animals, it tries to send sensed data (e.g., a picture or video) of its surroundings to the nearest BS. The energy aspect is also relevant for systems that are not connected to the power grid and are, for example, battery-based or solar-powered. Another advantage of AI techniques is that they could relieve network congestion by avoiding unnecessary data transfers, which also improves the end-to-end delay of those transfers that do need to go over the network. On the other hand, the limited processing and storage resources at the sensors have a negative impact on the accuracy of classification and prediction. In contrast, a cloud-based infrastructure connected to the ONUs could deduce information from the animals' raw data with higher classification and prediction accuracy. This makes it interesting to investigate the role and placement of AI-based methods in FiWi-based animal welfare and human-animal cohabitation systems.
3. The Role of Machine Learning
AI techniques, especially machine learning (ML), are mainly used in networking to make a distributed computing system intelligent enough to make decisions on its own about various aspects, for example, traffic engineering, changing device settings based on collected and processed data, and how much data to transmit from edge devices to fog nodes and/or the cloud. Machine learning helps particularly in identifying animal movement using image classification algorithms, in predicting animal movement or traffic in general and correlating it with human movement and traffic, and in classifying animal behavior. Additionally, based on the predicted network data traffic, a hybrid approach to processing and communication could be used. In other words, animal data could be processed at the source sensor (or end device) when animal movement is predicted to be more likely. On the other hand, during periods when animal movement is less likely, most of the collected data could be transferred to fog nodes (e.g., a server or mini-cloud) for more accurate processing, since a fog node with enough compute and storage capability can run heavyweight deep neural network models (e.g., ImageNet-scale models [21]) that generally have more hidden layers and neurons than lightweight models. At the same time, as previously mentioned, attention needs to be paid to the accuracy of the classification and the prediction, which is a challenge.
Machine learning has also been successfully used in optical transmission systems and networks, as presented in a survey [15]. In our OFDM-PON architecture, machine learning could also split optical resources flexibly based on the predicted traffic demands of heterogeneous services. For example, in a PON, upstream flow scheduling at the ONUs is more critical due to the lower data rate (e.g., 2.5 Gb/s) compared to the downstream traffic at the OLT with its higher data rate (e.g., 10 Gb/s). Moreover, an ONU needs to allocate OFDM subcarriers in an OFDM-PON, or time slots in an Ethernet PON, to various types of application demands based on their requirements (latency, bandwidth). As observed in the literature, the peaks and lows of various traffic patterns can occur at different times of the day. Thus, learning-based traffic prediction could allocate just enough bandwidth to the offered traffic demands [22]. In 5G, for example, cellular connections would need to satisfy latencies on the order of milliseconds. Therefore, a set of dedicated but flexible frequency bands and slots must be assigned to latency-sensitive applications. More importantly, a lightweight image classification model, such as MobileNet [23], can run on power- and resource-constrained end devices and could help identify the presence or absence of animals and other related information.
To address the issue of the said granularity of traffic patterns, we consider multi-class (three) traffic scenarios and use machine-learning-based traffic prediction to allocate bandwidth according to the traffic demands of each class. Specifically, we use a long short-term memory (LSTM) network model, a recurrent neural network with LSTM units [24], for the time-series traffic prediction, since it is capable of learning long-term dependencies. More importantly, it avoids vanishing gradients, a common problem in machine learning algorithms where the gradients of the error function shrink quickly without improving the learning process. To get a good prediction, a machine learning algorithm needs to be trained with many different traffic patterns; depending on the number of hidden layers, the neurons per layer and the training datasets, it then predicts (and also classifies) future events. In our approach, we first generated a time-varying mean and variance, and used a lognormal distribution to generate a set of traffic data samples [22]. The time-varying mean values are obtained as a superposition of sinusoidal functions, as given by Equation (1), where A is an amplitude constant, a_i and b_i are frequency-dependent constants, and n is the number of frequency components.
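To make the traffic-generation step concrete, the following minimal Python sketch draws lognormal samples around a sinusoidal time-varying mean in the spirit of Equation (1); the amplitudes, frequencies, phases and variance below are illustrative placeholders rather than the constants used in the paper, and the helper names are our own.

```python
import numpy as np

# Sketch of the synthetic traffic generation described above (values are
# illustrative placeholders, not the constants used in the paper).
def time_varying_mean(t, A=100.0, freqs=(1/24.0, 1/12.0, 1/8.0), phases=(0.0, 0.5, 1.0)):
    """Superposition of n sinusoidal frequency components (Equation (1) style)."""
    mean = np.zeros_like(t, dtype=float)
    for f, p in zip(freqs, phases):
        mean += A * np.sin(2 * np.pi * f * t + p)
    return np.abs(mean) + 1.0  # keep the mean strictly positive

def generate_traffic_samples(hours=720, samples_per_hour=60, sigma=0.25, seed=42):
    """Draw lognormal traffic samples whose mean follows the time-varying profile."""
    rng = np.random.default_rng(seed)
    t = np.linspace(0, hours, hours * samples_per_hour)
    mu = time_varying_mean(t)
    # parameterize the lognormal so that its mean tracks mu
    return rng.lognormal(mean=np.log(mu) - sigma**2 / 2, sigma=sigma)

samples = generate_traffic_samples()   # 720 h x 60 samples/h = 43,200 samples
print(len(samples), samples[:5])
```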
We generated 43,200 traffic data samples per traffic class for training, validation and testing of the prediction model, of which 67% were used for training and the remaining 33% for validation and testing of an LSTM network model consisting of an input layer, a hidden layer with 4 LSTM blocks, and an output layer that makes a single-value prediction. Each LSTM block has a memory cell for storing a value over long or short time periods, and gates (input, output and forget) that act on the signals (x) they receive using the sigmoid (shown in Equation (2)) and tanh activation functions. We used the Adam optimizer to automatically set the learning rate and update the network weights.
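A minimal Keras sketch of the described LSTM predictor is given below; the layer sizes (4 LSTM blocks, single-value output), the Adam optimizer and the 67%/33% split follow the text, whereas the look-back window, the number of epochs and the stand-in traffic series are assumptions made for illustration.

```python
import numpy as np
from tensorflow import keras

# Stand-in traffic series (in practice, the 43,200 lognormal samples described above).
t = np.arange(43_200, dtype="float32")
series = 100.0 * np.abs(np.sin(2.0 * np.pi * t / 1440.0)) + 1.0

look_back = 1  # number of past samples used to predict the next one (assumed)

def make_supervised(data, look_back=1):
    """Turn a 1-D traffic series into (X, y) pairs for one-step-ahead prediction."""
    X = np.array([data[i:i + look_back] for i in range(len(data) - look_back)])
    y = data[look_back:]
    return X.reshape(-1, look_back, 1), y

split = int(0.67 * len(series))                 # 67% training, 33% validation/testing
X_train, y_train = make_supervised(series[:split], look_back)
X_test, y_test = make_supervised(series[split:], look_back)

model = keras.Sequential([
    keras.layers.Input(shape=(look_back, 1)),
    keras.layers.LSTM(4),                       # hidden layer with 4 LSTM blocks
    keras.layers.Dense(1),                      # single-value prediction
])
model.compile(optimizer="adam", loss="mse")     # Adam sets the learning rate automatically
model.fit(X_train, y_train, epochs=10, batch_size=32, verbose=0)

rmse = float(np.sqrt(model.evaluate(X_test, y_test, verbose=0)))
print("test RMSE:", rmse)
```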
In Figure 2, a small fraction of the real traffic data and of the data predicted during training and testing is plotted against the indices of the traffic values to illustrate the prediction accuracy of one traffic class using the LSTM network model. We can see that the LSTM neural network-based prediction is very good, since the root mean square errors during training and testing are very low (3.04 and 3.08, respectively). Thus, LSTM networks can learn various traffic patterns to predict future traffic, which is why we propose to use them for allocating spectrum proportionately to the different classes of traffic. Additionally, for animal identification on end devices (Raspberry Pi) connected to sensors, we use the MobileNet model [23], which was recently proposed for image processing on resource- and power-constrained devices, such as smartphones. The MobileNet model is a convolutional neural network (CNN) in which depth-wise separable convolutions are used to build lightweight deep neural networks.
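As an illustration of this edge-side classification step, the sketch below loads a MobileNet pre-trained on ImageNet and classifies a single camera frame; the file name frame.jpg and the animal-label subset are placeholders, and this is not the exact model configuration or label mapping used in our experiments.

```python
import numpy as np
from tensorflow.keras.applications.mobilenet import MobileNet, preprocess_input, decode_predictions
from tensorflow.keras.preprocessing import image

# Lightweight CNN with depth-wise separable convolutions, pre-trained on ImageNet.
model = MobileNet(weights="imagenet")

# Classify one captured frame ("frame.jpg" is a placeholder file name).
img = image.load_img("frame.jpg", target_size=(224, 224))
x = preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))
preds = decode_predictions(model.predict(x), top=3)[0]

# Assumed subset of ImageNet labels treated as "animal present".
ANIMALS = {"wild_boar", "hog", "ox", "gazelle", "impala", "ibex"}
detected = any(label in ANIMALS for (_, label, _) in preds)
print(preds, "-> animal detected" if detected else "-> no animal")
```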
6. Performance Evaluation
In this section we evaluate the performance of the integrated IoT-fog-to-cloud early warning system using three performance metrics: average end-to-end per-packet latency, notification time, and connection blocking. The average end-to-end packet latency is measured as the average time required to transmit a sensor data packet from an edge device to a BS, which mainly depends on the congestion in the WSNs. The notification time is measured as the average time difference between the detection of an animal and the alert message being sent to passing vehicles/humans. Connection blocking occurs in the OFDM-PON when an ONU cannot reserve the bandwidth required for a certain duration to transfer the data of a traffic class.
The scenario we analyze using the Riverbed Modeler simulator consists of three end devices and two routers per coordinator (BS), as shown in Figure 6. End devices act as sensor nodes, and routers act as relays as well as sensors, detecting the presence of animals and capturing images of the surroundings to deduce animal information, including position, movement direction, speed, etc. Data is split into packets with a 100-byte payload (ZigBee), which are queued for transmission. The packets are consecutively forwarded to the coordinator acting as a sink (BS). The OFDM-PON acting as the optical access network is simulated with 16 ONUs. We assume that a WSN with 5 sensors is connected to each base station over the shortest routing path, and that each base station is directly connected to an ONU. The WSN topology is an aggregation-based network, as shown earlier in Figure 1, where the BSs act as sink nodes. The wireless link capacity is assumed to be 250 kilobits per second (Kbit/s) under the ZigBee protocol, without any packet retransmission. We assume a fiber capacity of 4 THz in the OFDM-PON, divided into 320 spectrum slices with a granularity of 12.5 GHz each. Each spectrum slice can support a maximum of 25 Gb/s using the QPSK modulation format. All 16 ONUs share an equal proportion of the upstream fiber bandwidth, i.e., each ONU can reserve 20 spectrum slices.
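The following back-of-the-envelope check summarizes how the simulated capacities fit together; the constant names are ours and not Riverbed Modeler parameters.

```python
# Consistency check of the simulated OFDM-PON / WSN configuration.
FIBER_BANDWIDTH_GHZ = 4_000          # 4 THz of upstream fiber bandwidth
SLICE_GHZ = 12.5                     # spectrum-slice granularity
NUM_ONUS = 16
SLICE_RATE_GBPS = 25                 # per slice, QPSK modulation

slices_total = int(FIBER_BANDWIDTH_GHZ / SLICE_GHZ)      # 320 slices
slices_per_onu = slices_total // NUM_ONUS                 # 20 slices per ONU
onu_upstream_gbps = slices_per_onu * SLICE_RATE_GBPS      # upstream ceiling per ONU

ZIGBEE_RATE_BPS = 250_000            # wireless link capacity
PAYLOAD_BYTES = 100                  # ZigBee payload per packet
packet_tx_ms = PAYLOAD_BYTES * 8 / ZIGBEE_RATE_BPS * 1e3  # 3.2 ms payload time per hop

print(slices_total, slices_per_onu, onu_upstream_gbps, round(packet_tx_ms, 1))
```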
We evaluate the performance of our animal-human cohabitation system by means of the latency observed (i) with AI, where the processed data is transferred from sensors to BSs, (ii) without AI, where raw sensed data is transferred, and (iii) with the so-called hybrid approach, which transfers most of the data during low-traffic periods and only the relevant information during busy hours.
We consider three classes of services: cellular or call (class-1), FTTH (class-2) and animal traffic (class-3).
Figure 7 illustrates the real and the predicted time-varying mean traffic arrival rates of these three classes. The simulation parameters are listed in
Table 1. It should be noted that the mean traffic arrival rates vary with time t, and the traffic (offered load) per class is calculated as A_k(t) = N * lambda_k(t) * h_k, where N is the number of class-k traffic sources, lambda_k(t) is the class-k mean arrival rate at the traffic source, and h_k is the mean holding (service) time of class-k connections, which reserve upstream bandwidth between the ONUs and the central office. We assume that the traffic arrivals of each class form a Poisson process with the time-varying mean arrival rates shown in
Figure 7, where the mean arrival rate of class-k traffic changes every hour. The mean holding times of class-1 and class-2 traffic are assumed to be exponentially distributed with a mean of 10 minutes, as shown in Table 1. For each animal appearance event at a sensor, the corresponding ONU tries to reserve 250 Kbit/s of bandwidth between the ONU and the central office. The mean holding time required to transfer the packets of each animal event at the ONUs is kept constant (40 milliseconds, i.e., 0.00067 minutes), since the (processed) data volume of each animal connection that needs to be transferred from an ONU to the central office is fixed at 1 KB and the required bit rate is assumed to be 250 Kbit/s. The animal appearance events are generated at the sensors (end devices), and the cellular and FTTH arrival events are generated at each BS/ONU. The number of simulated arrival events per hour varies with time, since the arrival rate of each class changes every hour. Furthermore, since we focus on upstream reservation, all results presented here are based on upstream reservations from edge devices to BSs to the central office. We maintain a sufficiently large fixed-size queue at each ONU to store the data packets of the animal class. The other, latency-sensitive classes of traffic (cellular and FTTH) are either served immediately or blocked.
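A simplified sketch of how such per-class connection events could be generated is shown below; the hourly rate profile and the holding-time choices follow the description above, but the numeric profile is a placeholder rather than the values of Figure 7 and Table 1, and arrivals are simply restarted at each hour boundary.

```python
import numpy as np

rng = np.random.default_rng(1)

def generate_events(hourly_rates, holding_fn):
    """Poisson arrivals whose mean rate (events per minute) changes every hour."""
    events = []  # list of (arrival_time_min, holding_time_min)
    for hour, rate_per_min in enumerate(hourly_rates):
        t = hour * 60.0
        while True:
            t += rng.exponential(1.0 / rate_per_min)   # exponential inter-arrival time
            if t >= (hour + 1) * 60.0:
                break
            events.append((t, holding_fn()))
    return events

# Placeholder daily profile (24 hourly rates), not the measured/predicted rates.
hourly_rates = [2.0 + 1.5 * np.sin(2 * np.pi * h / 24) for h in range(24)]

cellular = generate_events(hourly_rates, lambda: rng.exponential(10.0))  # 10 min mean holding
animal   = generate_events(hourly_rates, lambda: 0.00067)                # fixed 40 ms holding
print(len(cellular), len(animal))
```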
Figure 8 illustrates the average per-packet delay (latency) of animal traffic between sensors and BSs under three data transmission scenarios using the ZigBee protocol: 1 KB, 100 KB and hybrid (i.e., a mix of 1 KB and 100 KB). These scenarios also correspond to data processing at different layers: the edge (sensors), the fog (BSs) and a hybrid approach, i.e., processing either at the edge or at the BSs. We could also compare these scenarios with cloud processing, where raw sensed data is sent to the cloud for processing and decision making; however, we omit this scenario, since we did not simulate the core network and the cloud, and it is obvious that the latency obtained with cloud processing would be much higher than in the other three scenarios presented here. Furthermore, we assume that the specified data volume is transmitted for every animal detection event at the sensors. In the first scenario, we assume that the sensors are attached to a processing unit (e.g., a Raspberry Pi), so they run a simple machine learning model (e.g., MobileNet) to detect and classify animal-related data and transmit only 1 KB of data (around 10 data packets) for every detection, as we have also shown through the experimental setup in Section 4. In the second scenario, the end devices (sensors) leave the processing task to the fog layer (BSs) and transmit 100 KB of data (around 1000 data packets) to the BSs. The hybrid case, on the other hand, takes the predicted animal traffic into account to decide whether to process data at the edge layer (sensors) or at the fog layer (BSs). The decision threshold is shown by a black dotted line in Figure 7c, and it could be set or even varied by monitoring congestion or the predicted animal traffic. In the hybrid case, we enforce processing at the sensor nodes (1 KB transmission) when the predicted arrival rate is higher than the threshold; otherwise, the end devices transfer 100 KB of data per animal detection.
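The hybrid policy therefore reduces to a simple thresholding rule; a sketch with an assumed threshold value and illustrative rates is given below.

```python
# Hybrid edge/fog policy: process at the sensor (send 1 KB) when the predicted
# animal arrival rate exceeds the threshold, otherwise offload the raw 100 KB.
EDGE_PAYLOAD_KB, FOG_PAYLOAD_KB = 1, 100
THRESHOLD = 5.0   # predicted animal events per hour (assumed; dotted line in Figure 7c)

def payload_for_detection(predicted_rate_per_hour: float) -> int:
    """Return how many KB a sensor transmits for one detection event."""
    if predicted_rate_per_hour > THRESHOLD:
        return EDGE_PAYLOAD_KB    # busy hours: classify locally, send only the result
    return FOG_PAYLOAD_KB         # quiet hours: ship raw data to the fog for better accuracy

for rate in (1.2, 4.8, 7.5, 11.0):
    print(rate, "->", payload_for_detection(rate), "KB per detection")
```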
From
Figure 8 we observe that when the end devices process the raw data and transfer only 1 KB of data to the BSs per animal detection event, the average packet latency varies between 26 and 30 milliseconds. In this case it can be deduced that the WSNs are not congested (i.e., the queuing delay is negligible). Therefore, we can compare this with the one-hop transmission in the experimental setup in Section 4, where the average per-packet latency was 22 milliseconds. We observe that both results are comparable. However, the latency also depends on the distance between transmitter and receiver: unlike in the experimental setup, the maximum distance between end devices and routers/coordinators is used in the simulation environment, so a small propagation delay also contributes to the latency obtained in the simulation. Furthermore, the effect of congestion on packet latency can be seen in the 100 KB transmission case, where the latency increases by two orders of magnitude compared to the 1 KB transmission scenario. Interestingly, the average packet latency in the hybrid scenario can be reduced from a few seconds to milliseconds during the congested hours of the day by applying an intelligent control mechanism based on traffic prediction.
Figure 9 shows the approximate average notification time, i.e., the average time difference between the arrival of an animal detection event and the alert message being sent to the passing vehicles/humans. The notification time is calculated as the sum of the data transmission time, the average packet latency and the processing time. We ignore the latency of distributing alert messages from the BSs to passing vehicles, since it is negligible compared to the other latencies and remains the same in all three scenarios compared here. Furthermore, we assume that the processing times at the sensors and the BSs are 5 s and 1 s, respectively. ZigBee transmits 100 bytes of data per packet at a maximum rate of 250 Kbit/s, so the total transmission times of 1 KB (10 packets) and 100 KB (1000 packets) of data are approximately 0.04 s and 4 s, respectively. As can be seen in Figure 9, the notification time in the second scenario (i.e., processing at the fog and 100 KB data transmission) varies from 6.5 s to 8 s, compared to around 5 s in the first scenario (processing at the sensors and transferring only the relevant 1 KB of information). It should be noted that even though the processing time at the edge layer (5 s) is chosen to be five times as long as that of the fog layer (1 s), it is beneficial to process data at the edge layer and transfer only relevant data to higher layers during the peak hours (dawn and dusk), as is done in the hybrid processing, especially in WSNs with a low-power communication protocol (e.g., ZigBee), since the packet latency and the transmission time are the main contributors to the notification time.
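Using this definition, the notification times can be reproduced approximately as follows; the average packet latencies plugged in below are representative mid-range values read off Figure 8 rather than exact simulation output.

```python
# Notification time = transmission time + average per-packet latency + processing time.
ZIGBEE_BPS = 250_000

def notification_time(data_kb, processing_s, avg_pkt_latency_s, rate_bps=ZIGBEE_BPS):
    tx_time = data_kb * 1000 * 8 / rate_bps      # serialization time of the whole payload
    return tx_time + avg_pkt_latency_s + processing_s

edge = notification_time(1,   processing_s=5.0, avg_pkt_latency_s=0.03)   # ~5.1 s
fog  = notification_time(100, processing_s=1.0, avg_pkt_latency_s=3.0)    # ~7.2 s
print(round(edge, 2), round(fog, 2))
```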
The above observation is further reinforced when we plot the notification time using the average data rate obtained in the experiments (i.e., 37 Kbit/s) with the ZigBee transmission protocol, which is much lower than the maximum achievable data rate (250 Kbit/s). At a 37 Kbit/s data rate, the total transmission times of 1 KB and 100 KB of data are approximately 0.22 s and 22 s, respectively. Also, using the image processing performed by the pre-trained MobileNet model, the processing time for an image of size 100 KB on an end device such as a Raspberry Pi is around 7 s. Let us assume that the processing time at the fog nodes (BSs) remains 1 s. The notification times for all three scenarios are depicted in Figure 10. We conclude that, when the transmission time is the dominating factor, it is essential to process data at the end devices most of the time in order to obtain a lower notification time, lower energy consumption, and fewer animal-vehicle collisions.
Finally,
Figure 11 illustrates the normalized fraction of cellular and FTTH requests dropped due to congestion with and without machine-learning-based traffic prediction and bandwidth allocation, i.e., with dynamic bandwidth allocation (DBA) and static bandwidth allocation (SBA), respectively, in the OFDM-PON. Without machine learning (i.e., under SBA), the bandwidth allocated to each ONU is statically divided among the three classes of traffic in proportion to their long-term average traffic per day. With machine learning (i.e., under DBA), the system divides the bandwidth based on the traffic predicted for every hour, thus exploiting the dynamic assignment of subcarriers. Furthermore, the animal traffic is very low compared to the cellular and FTTH traffic, so the animal (class-3) packets are buffered in sufficiently large memory whenever they cannot be sent to the central office immediately; hence, no packets of class-3 traffic are dropped at the ONUs/BSs. The normalized fraction of dropped requests of a class in an hour is calculated as the ratio of the number of blocked requests in that hour to the maximum number of requests generated in any hour of the day.
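A compact sketch of the two allocation policies and of the normalized drop metric is given below; the class shares and the per-ONU slice count are illustrative, and the function names are our own.

```python
import numpy as np

SLICES_PER_ONU = 20  # upstream spectrum slices available per ONU (as in Section 6)

def sba_allocation(daily_avg_traffic):
    """Static split proportional to each class's long-term average daily traffic."""
    share = np.asarray(daily_avg_traffic, dtype=float)
    share /= share.sum()
    return np.floor(share * SLICES_PER_ONU).astype(int)

def dba_allocation(predicted_traffic_this_hour):
    """Dynamic split proportional to the LSTM-predicted traffic of the current hour."""
    share = np.asarray(predicted_traffic_this_hour, dtype=float)
    share /= share.sum()
    return np.floor(share * SLICES_PER_ONU).astype(int)

def normalized_drop_fraction(blocked_per_hour, generated_per_hour):
    """Blocked requests in an hour over the day's maximum hourly request count."""
    return np.asarray(blocked_per_hour) / np.max(generated_per_hour)

# Illustrative class shares (cellular, FTTH, animal): static vs. one busy hour.
print(sba_allocation([40, 50, 10]), dba_allocation([70, 20, 10]))
```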
Figure 11 shows that, under the SBA policy, the fraction of dropped connections of a class (FTTH or cellular) at any time is proportional to the traffic (or mean arrival rate) of that class. On the other hand, we also see that AI-enabled dynamic management of optical resources, i.e., DBA, decreases the fraction of dropped connections during peak hours for both traffic classes by assigning bandwidth in proportion to the classes' requirements, thus effectively exploiting the statistical multiplexing gain. More importantly, the DBA gain for one class (FTTH) can come at the cost of an increased fraction of dropped connections for another class (cellular) during a given time interval, depending on the accuracy of the predicted traffic. This can be seen in the first 12 h of the day in Figure 11, where the fraction of dropped connections of FTTH (cellular) traffic decreases (increases) under the DBA scheme compared to the SBA scheme. The reason is that DBA reserves proportionately more of the ONUs' upstream bandwidth for FTTH traffic than for cellular traffic in the first half of the day, since the predicted FTTH traffic is higher than the predicted cellular traffic in the first 12 h (see
Figure 7).