Managing Time-Sensitive IoT Applications via Dynamic Application Task Distribution and Adaptation

The recent proliferation of the Internet of Things (IoT) has led to the pervasion of networked IoT devices such as sensors, video cameras, mobile phones, and industrial machines. This has fueled the growth of Time-Sensitive IoT (TS-IoT) applications that must (1) collect the sensor observations they need from appropriate IoT devices and (2) analyze the data within application-specific time-bounds. If this is not achieved, the value of these applications and the results they produce depreciates. At present, TS-IoT applications are executed in a distributed IoT environment that consists of heterogeneous computing and networking resources. Due to the heterogeneous and volatile nature (e.g., unpredictable data rates and sudden disconnections) of the IoT environment, ensuring the time-bounds of TS-IoT applications has become a major challenge. Existing task management techniques (i.e., techniques that manage the execution of IoT applications on distributed computing resources) do not provide a complete solution for managing TS-IoT applications in a manner that guarantees their time-bounds. This paper proposes TIDA, a comprehensive platform for managing TS-IoT applications that includes a task management technique, called DTDA, which incorporates novel task sizing, distribution, and dynamic adaptation techniques. DTDA's task sizing technique measures the computing resources required to complete each task of the TS-IoT application at hand in each available IoT device, edge computer (e.g., network gateway), and cloud virtual machine. DTDA's task distribution technique distributes and executes the tasks of each TS-IoT application in a manner that meets their time-bound requirements.
Finally, DTDA includes a task adaptation technique that dynamically adapts the distribution of tasks (i.e., redistributes TS-IoT application tasks) when it detects a potential application time-bound violation. The paper describes a proof-of-concept implementation of TIDA that uses Microsoft's Orleans Actor Framework, and demonstrates, through an experimental evaluation involving real time-sensitive IoT applications from the smart city domain, that the DTDA task management technique of TIDA meets the time-bound requirements of TS-IoT applications.


Introduction
The Internet of Things (IoT) is a novel expansion of the Internet that forms a network of heterogeneous physical objects such as sensors, video cameras, mobile phones, and industrial machines (all of which we refer to as IoT devices) that can communicate and exchange data with each other over the Internet [1][2][3].
In recent years, the quantity of data generated from IoT devices (we refer to this data as IoT data) has grown significantly and there has been a great deal of interest in extracting valuable insights from these data [4]. To accomplish this, IoT applications collect IoT data from appropriate IoT devices and produce high-value information by analyzing the collected IoT data.

The main contributions of this paper are as follows:

1. A novel task management technique called Dynamic Task Distribution and Adaptation (DTDA) that integrates the following techniques:
• A task sizing technique that utilizes an unsupervised machine learning technique to extract the most important features from task sizing measurement data (i.e., the computing and network resources required to complete each task of a time-sensitive IoT application).
• A greedy task distribution technique, presented in [3], that distributes and executes the TS-IoT application in the IoT environment.
• A novel task adaptation technique that dynamically adapts and redistributes TS-IoT application tasks when it predicts a time-bound violation (i.e., that the TS-IoT application cannot meet its time-bound requirements).
Our earlier work [3] presented a greedy task distribution technique that is unable to dynamically adapt the task distribution in real time to cope with volatile changes in the IoT environment. The DTDA technique presented in this paper addresses this key limitation.

2. A complete implementation of the improved task sizing and novel task adaptation techniques using Microsoft's Orleans Actor framework.

3. A comprehensive experimental evaluation using a real-world smart city use case and related dataset that includes: (a) an experimental setup that forms four clusters of computing resources with heterogeneous system configurations to validate the DTDA task management technique; and (b) a comprehensive comparison with two current state-of-the-art task management techniques that shows how well the TIDA platform, which implements the above techniques, meets the time-bound requirements of TS-IoT applications.
The remainder of the paper is organized as follows. Section 2 presents the related work, Section 3 presents a motivating use case scenario, Section 4 describes the system model and problem formulation, Section 5 discusses the dynamic task distribution and adaptation technique, and Section 6 presents the design and implementation of the TIDA platform. Section 7 presents the experimental evaluation results and Section 8 concludes the paper and outlines potential future work.

Related Work
TS-IoT applications are executed in an IoT environment that connects a variety of IoT devices such as sensors, mobile phones, cameras, and industrial machines with each other and/or to the Internet. Intrinsically, this environment is highly distributed. Traditional distributed real-time systems consist of a set of computer resources interconnected by a real-time communication network [22]. Generally, distributed real-time systems include a controlled execution environment (i.e., an environment where applications are executed). In addition, distributed real-time systems found in power plants and factory control systems are not connected to the Internet. In these distributed real-time systems, attributes of the execution environment, such as network latencies, data generation rates, and available computing resources, can be determined in advance [23,24]. On the contrary, although the IoT environment is highly distributed, it has unique characteristics that, compared to traditional distributed real-time systems, impose additional challenges for executing TS-IoT applications in a manner that meets their time-bound requirements [3,8,25]. We further discuss these additional challenges of the IoT environment in Section 3 via a motivating scenario, and in Section 4 by presenting a formal model of the IoT environment.
Many techniques in the literature have been developed for distributed real-time systems and aim to optimize the execution of real-time applications in terms of execution time [26], cost of execution [27], and energy consumption [28]. Furthermore, due to the controlled execution environment of distributed real-time systems, these techniques largely consist of optimization-based scheduling techniques [22,[29][30][31] that can effectively determine which task needs to be executed, at what time, and on which resource. However, such optimization-based techniques are only effective in controlled and predictable execution environments. More specifically, in this paper, we focus on techniques for dynamically distributing and adapting the tasks of TS-IoT applications by determining the relevant communication delays involved and the needed computing resource capacities. Adaptive scheduling techniques found in distributed real-time systems, in contrast, determine which tasks need to be executed, and at which times, to meet the deadlines of the system; hence, they cannot be used for TS-IoT application task distribution. For this reason, and due to the unique characteristics of the IoT environment (such as heterogeneity and volatility) compared to traditional distributed real-time systems, techniques developed for distributed real-time systems cannot be effectively used to ensure that TS-IoT applications meet their time-bound requirements. Therefore, many studies have recently been conducted to devise suitable techniques for facilitating TS-IoT applications to meet their time-bound requirements.
Meeting the time-bound requirements of TS-IoT applications is challenging due to the heterogeneous and volatile nature of the IoT environment and the time-bound requirements of such applications [8,32]. In one of our previous works [3], we proposed an approach for dealing with these challenges that involves decomposing TS-IoT applications into a collection of interrelated tasks and selecting the appropriate IoT computing and network resources to execute the tasks of each application in such a manner that they collectively meet the application's time-bound requirements. To enable such task distribution, a task sizing technique was proposed for estimating the computing and network resources required by the tasks of TS-IoT applications. Related work in determining the most suitable IoT resources for computing IoT application tasks includes [13,14], which investigated how to estimate the computing resources required by cloud-based IoT applications based on historical performance metrics, and evaluated various techniques for achieving this via the CloudSim simulator. Zeng et al. [15] presented a simulator called IOTSim to analyze IoT applications to understand how they would perform in cloud computing environments. In [16], the researchers proposed a technique for measuring the performance of computing resources when different IoT application tasks are executed, whereas Korala et al. [33] introduced a platform to experimentally evaluate the performance of TS-IoT applications. Alhamazani et al. [34] proposed another approach for measuring the performance of IoT applications across multiple cloud environments. Souza et al. [35] proposed a novel framework to take measurements of IoT microservices-based applications deployed across edge and cloud environments.
Most related research in task distribution has considered this problem as an optimization problem and proposed various optimization techniques (such as linear programming, non-linear programming, and heuristic techniques) for this purpose. For example, Taneja et al. [36] proposed a technique for efficient distribution of application tasks across cloud and edge resources in a resource-aware manner. Skarlat et al. [37] proposed an optimization technique that generates a task execution plan for IoT applications. Hong et al. [17] introduced a technique for optimizing the scheduling of IoT application tasks in edge devices. Yousefpour et al. [18] formulated IoT application distribution as an Integer Non-Linear Problem (INLP). The authors then used INLP to minimize the cost of resource usage while satisfying the QoS requirements of the applications. The optimization techniques proposed by [12,19,20] determine appropriate computing resource selection for meeting the QoS requirements of IoT applications. Related computing frameworks and tools, such as those in [38][39][40][41][42], have employed similar techniques to manage the distribution of TS-IoT applications. Zhang et al. [43] proposed a recommender system for dealing with the heterogeneity of cloud computing resources.
The IoT is subject to uncertainties, including unpredictable IoT data, volatile and mobile computing resources, and resource-constrained computing resources that become overloaded due to multitenancy [35]. Therefore, task execution plans may need to be adapted dynamically to deal with possible time-bound violations. To address this, many related studies have presented dynamic task redistribution techniques, which involve generating new task execution plans periodically and/or when certain computing resources become overloaded. Yousefpour et al. [18] introduced a technique to dynamically adapt task execution plans by periodically releasing and deploying tasks when there are QoS requirement violations. Skarlat et al. [21] also proposed a technique to dynamically redistribute tasks when certain resources are overloaded and/or disconnected from the IoT environment. Xu et al. [44] proposed a solution that can recommend appropriate virtual machines for IoT application workloads in edge-cloud environments, using a tree-based machine learning algorithm to predict performance metrics for these workloads.
In summary, optimization-based techniques that have been developed for distributed real-time systems cannot be used with TS-IoT applications to enable them to meet their time-bound requirements due to the volatile nature of the IoT environment. Task sizing techniques found in the literature have relied on simulation tools [13,14] or limited testbeds [16] for sizing tasks using estimations. These techniques cannot effectively estimate the resources needed by TS-IoT application tasks because they do not deal with the heterogeneity and dynamic nature of the IoT environment. Most of the task distribution techniques in the literature employ complex optimization techniques [17][18][19][20] to devise task execution plans; furthermore, most of them do not consider task sizing and are expensive to compute. Existing dynamic task adaptation techniques [18,21] have mainly employed task redistribution techniques that adapt the task execution plans in a reactive manner (i.e., after observing that the application has failed to meet its time-bound requirement or that resources are overloaded), which is not an effective way to guarantee time-bound requirements. For these reasons, these techniques are not suitable for TS-IoT applications that have demanding time-bound requirements. On the contrary, TIDA includes a Dynamic Task Distribution and Adaptation (DTDA) task management technique that integrates: (1) a task sizing technique that measures the computing and network resources required by the tasks when they are executed in the IoT environment; (2) a greedy task distribution technique that uses the task sizing information to generate time-bound-satisfying task execution plans to distribute tasks in the IoT environment; and (3) a dynamic task adaptation technique that utilizes a predictive machine learning model to accurately predict possible time-bound violations and make the necessary adaptations to the task execution plans to ensure that the time-bound requirements are met.
Furthermore, TIDA was implemented by extending Microsoft Orleans and the greedy algorithm was evaluated using a real-world smart city application.

Smart City Passenger Counting Application-Motivating Scenario
Let us consider a smart city application that requires an accurate count of passengers for a public transport system in near real-time. The passenger count information is used by the transport service to improve planning and scheduling of buses, allocate buses or trains to meet the actual demand, and respond to unplanned incidents such as bus breakdowns and accidents. Figure 1 illustrates the motivating scenario, computing resources, and IoT data analysis tasks in this TS-IoT application.

To count passengers in this smart city environment, we utilized the following IoT devices, edge computers, and cloud resources:
1. Orbbec Persee IoT devices, each providing a combination of RGB and infrared cameras with a fully functioning onboard computer, mounted above the doors of each bus. We used these devices to count the passengers stepping in and out of each bus.
2. Edge computers at bus stops and train stations. These edge computers act as gateways for IoT devices and connect to the cloud data center via the Internet. Furthermore, edge computers also include additional computing and storage resources that can be used for IoT data analysis.
3. A cloud data center with virtually unlimited computing resources.
In this IoT environment, the IoT devices, edge computers, and the cloud are connected via different networks (e.g., NB-IoT, 4G, and broadband). The Orbbec Persee IoT devices incorporate Wi-Fi cards and, as a result, they can connect to the edge computer at each bus stop. In addition, these IoT devices can also be directly connected to the cloud via 4G during the entire bus journey. However, IoT devices can connect to edge computers only when they are near bus stops or train stations. The edge computers and the cloud data centers are connected via broadband Internet. To compute the occupancy of each bus and the total occupancy, this TS-IoT application must perform the following: (1) capture passenger data while passengers are stepping in and out of each bus; (2) analyze the collected RGB/infrared/depth data and recognize individual passengers; and (3) compute the occupancy of each bus at each bus stop and across the entire transport network. This analysis may involve the following sub-tasks: (1) pre-processing the collected RGB/infrared/depth data; (2) classifying passengers as entering or exiting by applying classification techniques such as the Haar-cascade classifier (note that, in this paper, we consider the classifier to be already trained; hence, training the classifier is not considered to be an IoT data analysis task and is not discussed further); (3) calculating the total occupancy of the bus; and (4) computing the total occupancy of all the buses in the transport network.
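Sub-tasks (3) and (4) amount to simple aggregation over the classified door events. As an illustration only (the function names and event values below are ours, not the paper's), they can be sketched as:

```python
def update_occupancy(occupancy, bus_id, entered, exited):
    """Sub-task 3: update the occupancy of one bus from classified door events."""
    occupancy[bus_id] = max(0, occupancy.get(bus_id, 0) + entered - exited)
    return occupancy[bus_id]

def network_occupancy(occupancy):
    """Sub-task 4: total occupancy across all buses in the transport network."""
    return sum(occupancy.values())

# Hypothetical classified events: (bus_id, passengers_in, passengers_out)
events = [("bus-1", 5, 0), ("bus-2", 3, 1), ("bus-1", 2, 4)]
occupancy = {}
for bus_id, n_in, n_out in events:
    update_occupancy(occupancy, bus_id, n_in, n_out)
# occupancy -> {'bus-1': 3, 'bus-2': 2}; network total -> 5
```

The time-critical part of the application is not this arithmetic but delivering the classified events to wherever this aggregation runs before the bus reaches the next stop.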
The IoT passenger count application has a variable timebound that is hard to meet, i.e., it fails to meet its time-bound requirement when any bus reaches the next bus stop before its occupancy information from the previous bus stop is counted. Meeting the timebound requirements in the IoT often depends on the appropriate selection of computing and networking resources for each TS-IoT application. In the passenger counting IoT application, although we perform the entire data analysis quickly in the cloud, this may involve a significant communication delay to collect all the passenger RGB/infrared/depth data. Offloading the collected passenger data to the edge computers and performing the data analysis in the edge computers is another option. However, we only have a limited time to transfer the passenger data to the edge computer and the computing resources in edge computers are more limited than in the cloud. Processing data in an IoT device itself is another option that is viable only if an IoT device has enough computing resources available for the tasks of the IoT application at hand. Therefore, to meet the time-bound requirements of this and any other TS-IoT application, we must determine the best possible distribution of the data analysis tasks that comprise the TS-IoT application from the perspective of providing enough computing resources and communication capacity, and compute the assigned analysis tasks in a manner such that the entire TS-IoT application meets its time-bound(s).
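The cloud/edge/device trade-off described above can be made concrete with a toy calculation (all numbers below are hypothetical, chosen only to illustrate the transfer-time versus compute-time tension):

```python
def end_to_end(data_mb, bandwidth_mb_s, proc_time_s):
    """Transfer delay plus processing time for one placement option."""
    return data_mb / bandwidth_mb_s + proc_time_s

options = {
    "cloud":  end_to_end(40, 2, 2),             # slow 4G uplink, fast compute: 20 + 2 = 22 s
    "edge":   end_to_end(40, 40, 6),            # fast Wi-Fi to bus-stop edge:   1 + 6 =  7 s
    "device": end_to_end(0, float("inf"), 20),  # no transfer, slow compute:     0 + 20 = 20 s
}
best = min(options, key=options.get)  # -> "edge" for these numbers
```

With different data volumes, bandwidths, or available compute, a different placement wins, which is precisely why the distribution must be computed per application and adapted over time.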
However, as discussed in Section 2, unlike distributed real-time systems, the IoT environment introduces additional challenges. The IoT environment consists of heterogeneous computing resources connected by multiple networks, which have different data transfer rates, latencies, and bandwidth values. Moreover, the IoT environment tends to be volatile due to (1) unpredictable IoT data generation rates; (2) uncertain availability of resources caused by mobility, connection issues, etc.; and (3) overloaded resources attributed to multitenancy [3,8,25]. Therefore, TS-IoT applications require adaptation of their task distribution at runtime to continue to satisfy the application time-bounds. For example, the number of passengers onboard at each bus stop varies and cannot be predicted; thus, the passenger onboarding data fluctuates throughout the bus journey. Moreover, some computing resources (e.g., the cloud virtual machines) that are processing the TS-IoT application tasks may disconnect from the IoT environment due to network issues. Additionally, some computing resources such as edge computers may become overloaded due to multiple applications competing for resources, in addition to simply not having enough available resources to handle the incoming variable IoT data. Hence, the computing and storage resources of the IoT are volatile. As a result, even if we determine the best possible distribution of tasks for the TS-IoT application before deployment, it remains challenging to meet the application time-bounds during the application runtime.
Moreover, for these reasons, existing optimization-based techniques devised for distributed real-time systems are unsuitable and ineffective in supporting the time-bound requirements of TS-IoT applications: the distribution of TS-IoT application tasks must be adapted dynamically to deal with possible time-bound violations caused by the volatile nature of the IoT environment.

System Model & Problem Formulation
Due to the trade-offs between IoT resources in the distributed IoT environment, it is necessary to generate a task execution plan (which meets the application's time-bound requirement) by determining the relevant communication delays involved and the needed computing resource capacities for each task. A task execution plan is a mapping of the tasks of the TS-IoT application to the corresponding computing resources where the tasks are executed. To address this, we first present a formal description of the resources in the IoT environment and of TS-IoT applications. Then, we formulate the task distribution problem as an optimization problem.

Resource Model
Computing resources (i.e., IoT devices, edge computers, and the cloud) and network resources in the distributed IoT environment form a graph G_Res = (Comp_Res, Network_Res), where Comp_Res represents the distributed computing resources and Network_Res represents the network links between computing resources. A single computing resource of G_Res can be denoted as cr_i, where cr_i ∈ Comp_Res and i ∈ 1...m, with m the total number of computing resources in G_Res. Each cr_i has an attribute called Available_Res_{cr_i}, which is the amount of computing and storage resources available at cr_i; Available_Res_{cr_i} can be represented as a tuple (cpu_{cr_i}, ram_{cr_i}). Figure 2 shows the IoT environment graph of the resource model. A single network link of G_Res represents the network resources of a network link between two computing resources, cr_i and cr_j. This can be denoted as nr_ij ∈ Network_Res, where i and j denote the corresponding indexes of the two computing resources that are connected via network link nr_ij. Each nr_ij has the following attribute: Available_Band_{nr_ij} is the amount of available bandwidth of the network resource link nr_ij. Furthermore, Available_Band_{nr_ij} is captured by a tuple (up_ij, down_ij), where up_ij is the amount of available upload bandwidth and down_ij is the amount of available download bandwidth in nr_ij.
As discussed in Section 3, due to the volatile nature of the IoT environment, Available_Res_{cr_i} of the computing resources, the number of connected computing resources in the IoT environment, and Available_Band_{nr_ij} of the network links all vary over time, and the degree of change is hard to predict.
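As a purely illustrative sketch (the class and field names below are ours, not the paper's), the resource model G_Res can be captured with simple data structures, where each computing resource records its available CPU and RAM, and each network link records its available upload/download bandwidth:

```python
from dataclasses import dataclass, field

@dataclass
class ComputingResource:
    """A node cr_i of G_Res with its Available_Res tuple (cpu, ram)."""
    name: str
    cpu_cores: float  # available CPU
    ram_mb: float     # available RAM

@dataclass
class NetworkLink:
    """An edge nr_ij of G_Res with its Available_Band tuple (up, down)."""
    src: str
    dst: str
    up_mbps: float
    down_mbps: float

@dataclass
class ResourceGraph:
    """G_Res = (Comp_Res, Network_Res)."""
    comp_res: dict = field(default_factory=dict)     # name -> ComputingResource
    network_res: dict = field(default_factory=dict)  # (src, dst) -> NetworkLink

    def add_resource(self, cr):
        self.comp_res[cr.name] = cr

    def add_link(self, nr):
        self.network_res[(nr.src, nr.dst)] = nr

g = ResourceGraph()
g.add_resource(ComputingResource("persee-1", cpu_cores=4, ram_mb=2048))
g.add_resource(ComputingResource("edge-1", cpu_cores=8, ram_mb=8192))
g.add_link(NetworkLink("persee-1", "edge-1", up_mbps=20, down_mbps=40))
```

In a real deployment these attribute values are not static: they would be refreshed by monitoring, reflecting the volatility described above.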

Application Model
A TS-IoT application is comprised of a set of (possibly inter-dependent) tasks that interact via data exchanges. A TS-IoT application can be represented as a directed acyclic graph (DAG), G_App = (Tasks, Dataflows), where Tasks represents the tasks of the TS-IoT application and Dataflows represents the data flows between Tasks. Each TS-IoT application has a time-bound requirement, which we denote as TB_App.
A single task of G_App can be denoted as t_i, where t_i ∈ Tasks and i ∈ 1...n, with n the total number of tasks in G_App. Each task can be of two types: stateful tasks and stateless tasks. Stateful tasks require buffering a certain number of data items before processing them. We identify the number of data items required to buffer in a stateful task as the queue size and denote this as q_{t_i}. Stateless tasks do not require buffering of data items during their data processing; therefore, we consider q_{t_i} of stateless tasks to be 1. Furthermore, to identify whether a task is stateful, we define the binary attribute is_stateful_{t_i}: is_stateful_{t_i} = 1 if the task t_i is a stateful task and is_stateful_{t_i} = 0 otherwise. In the current model, we assume that the tasks run continuously; hence, we do not consider any loop variables (i.e., control variables) at this stage.
Each t_i has the following attributes: Task_Res_{t_i} is the amount of computing resources required for the execution of t_i; Proc_time_{t_i} denotes the time taken to process the IoT data at a specific computing resource, which depends on the computing resource where the task is executed. A t_i is also associated with two delays, which we denote as Start_Delay_{t_i} and Wait_Delay_{t_i}. The time taken to produce the first data item during IoT data processing is denoted by Start_Delay_{t_i}, and the delay between producing data items is denoted as Wait_Delay_{t_i}. We assume that the aforementioned attributes can be obtained by measurements.
A single dataflow of G_App represents the dataflow (i.e., data transfer) between a predecessor task t_i and a successor task t_j, and this can be denoted as d_ij, where d_ij ∈ Dataflows and i and j denote the indexes of the corresponding tasks. In our model, we assume that data is transferred piece by piece. Each d_ij has the following attributes: Data_{d_ij} is the size of a single data piece transferred through d_ij; the amount of time taken to send a single piece of data via a network link is denoted as Com_delay_{d_ij}. Figure 3 shows the TS-IoT application graph.
The above model is based on the following assumptions:
• We assume that cloud data centers in the IoT environment have a large amount of computing and storage resources [8], whereas IoT devices and edge computers have limited computing and storage resources.
• We assume that Task_Res_{t_i} can be obtained by measurements via executing the corresponding task on a reference computing resource.
• We assume that Proc_time_{t_i} on a computing resource can be obtained by estimation based on previous measurements.
• We assume that G_Res is constructed by considering the number of computing resources and their networks available in the IoT environment.
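Analogously to the resource model, the application model can be sketched with minimal data structures (again, the names and sample tasks are our own hypothetical illustration, loosely following the passenger counting scenario):

```python
from dataclasses import dataclass, field

@dataclass
class Task:
    """A task t_i of G_App; stateless tasks have queue size q = 1."""
    name: str
    is_stateful: bool = False
    queue_size: int = 1  # q_{t_i}

@dataclass
class Dataflow:
    """A dataflow d_ij between a predecessor t_i and a successor t_j."""
    src: str
    dst: str
    data_size_mb: float  # Data_{d_ij}

@dataclass
class AppGraph:
    """G_App = (Tasks, Dataflows) with a time-bound TB_App (seconds)."""
    time_bound_s: float
    tasks: dict = field(default_factory=dict)
    dataflows: list = field(default_factory=list)

    def add_task(self, t):
        self.tasks[t.name] = t

    def add_flow(self, d):
        self.dataflows.append(d)

app = AppGraph(time_bound_s=30.0)
app.add_task(Task("preprocess"))
app.add_task(Task("classify"))
app.add_task(Task("bus_occupancy", is_stateful=True, queue_size=10))
app.add_flow(Dataflow("preprocess", "classify", data_size_mb=0.5))
app.add_flow(Dataflow("classify", "bus_occupancy", data_size_mb=0.01))
```

Here the stateful occupancy task buffers ten classified events before processing, which is exactly the q_{t_i} that multiplies the communication and waiting delays in the problem formulation below.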

Problem Formulation
Our objective is to generate an application-specific, time-bound-satisfying task execution plan for the IoT environment within the available resources. To realize this, we need to generate a task execution plan in an IoT environment in a manner such that the end-to-end response time of the TS-IoT application is within the time-bound requirement of the application. Furthermore, in this model, we consider TS-IoT application graphs with multiple paths, and to capture this we consider the end-to-end response time of the critical path in the graph. We define this critical path of the application graph as a set of tasks and dataflows, forming a path, for which the end-to-end response time is maximal. We refer to this end-to-end response time of the application as the Total Application Execution Time and denote it as Total_Exec_App. Given this definition, we can formulate the following equation:

Total_Exec_App = max_{p = 1,...,Paths} Path_Exec_Time_p    (1)

where Path_Exec_Time_p is the end-to-end execution time along the path p and Paths is the total number of paths in G_App. For any path p, we can calculate Path_Exec_Time_p as the summation of execution times (i.e., the summation of data processing time at tasks and the delays involved in bringing data to the task, buffering data at tasks, etc.) of each task in that path p. Given this definition, we obtain the following:

Path_Exec_Time_p = Σ_{j=1}^{Y} Task_Exe_{t_j}    (2)

where Y is the total number of tasks in the path p and Task_Exe_{t_j} is the execution time of the j-th task in the path p of G_App. Task_Exe_{t_j} can be calculated as follows:

Task_Exe_{t_j} = Proc_time_{t_j} + q_{t_j} · Com_delay_{d_ij} + Start_delay_{t_i} + q_{t_j} · Wait_delay_{t_i}    (3)

In Equation (3), Proc_time_{t_j} is the amount of time taken to process IoT data by t_j. Com_delay_{d_ij} is the amount of time taken to transfer a single data item from the predecessor task t_i to the task at hand, t_j, via d_ij. To capture the total communication delay, we multiply Com_delay_{d_ij} by the queue size of t_j, which we denote as q_{t_j}.
Note that we do not need to consider the maximum of Com_delay_d_ij, because we apply this equation to a single path of the graph and, at the end, the critical path is chosen using Equation (1). We assume Com_delay_d_ij to be 0 if the two tasks (i.e., t_i and t_j) are executed in the same computing resource. Start_delay_t_i is the time taken to produce the first data item by the predecessor task t_i, and Wait_delay_t_i is the delay between producing data items at the predecessor task t_i. For stateful tasks, to capture the total Wait_delay_t_i, this term is multiplied by q_t_j (i.e., the queue size of the task t_j). Com_delay_d_ij can be calculated using the following:

Com_delay_d_ij = Data_d_ij / Available_Band_nr_ij    (4)

where Data_d_ij denotes the size of a single data piece that needs to be sent to t_j from the predecessor task t_i via d_ij, which is placed on the network link nr_ij, and Available_Band_nr_ij is the available bandwidth of nr_ij. Decision variables: We define the decision variables that form the task execution plan as follows: the first decision variable, α_cr_i,t_j, denotes whether a task t_j is distributed on a computing resource cr_i. The second decision variable, γ_nr_ij,d_ij, denotes whether a dataflow d_ij is placed on a network resource nr_ij.
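To make the response-time model concrete, the following Python sketch evaluates Equations (1)-(4) on a small hypothetical application graph. Every number (processing times, delays, queue sizes, bandwidth) is invented for illustration, and the helper names are ours, not part of TIDA.

```python
# Sketch of the response-time model (Equations (1)-(4)); all values hypothetical.

def com_delay(data_size_mb, bandwidth_mbps):
    """Equation (4): time (s) to send one data piece over a network link."""
    return data_size_mb * 8.0 / bandwidth_mbps

def task_exec_time(proc_time, com, queue_size, start_delay, wait_delay, stateful):
    """Equation (3): per-task execution time, given the predecessor's delays.
    Per the model, Wait_delay is multiplied by the queue size for stateful tasks."""
    wait = wait_delay * queue_size if stateful else wait_delay
    return proc_time + com * queue_size + start_delay + wait

# Two paths through a hypothetical application graph; each entry is one task.
paths = [
    [  # path 1: pre-process -> classify
        dict(proc_time=0.5, com=com_delay(1.0, 4.0), queue_size=3,
             start_delay=0.2, wait_delay=0.1, stateful=False),
        dict(proc_time=1.5, com=com_delay(0.5, 4.0), queue_size=3,
             start_delay=0.3, wait_delay=0.1, stateful=True),
    ],
    [  # path 2: pre-process -> count
        dict(proc_time=0.5, com=com_delay(1.0, 4.0), queue_size=3,
             start_delay=0.2, wait_delay=0.1, stateful=False),
        dict(proc_time=0.4, com=0.0, queue_size=1,  # same resource: Com_delay = 0
             start_delay=0.1, wait_delay=0.05, stateful=False),
    ],
]

# Equation (2): per-path summation; Equation (1): critical path = maximum over paths.
path_times = [sum(task_exec_time(**t) for t in p) for p in paths]
total_exec_app = max(path_times)
print(total_exec_app)
```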
Constraints: First, the task distribution on computing resources and the dataflow placement on network link resources must not exceed the available capacities of the corresponding computing and network resources. A task t_j can be distributed on the computing resource cr_i only if Available_Res_cr_i is at least equal to Task_Res_t_j. Since one computing resource may host several tasks, we can formally denote this as follows:

Σ_{j=1..n} α_cr_i,t_j × Task_Res_t_j ≤ Available_Res_cr_i, for every computing resource cr_i    (5)

Each network link can only transfer data that is within its available bandwidth, which we can formally denote as follows:

Σ_{d_ij} γ_nr_ij,d_ij × Data_d_ij ≤ Available_Band_nr_ij, for every network link nr_ij    (6)

where Data_d_ij denotes the amount of data transferred between tasks t_i and t_j via the network link nr_ij, and γ_nr_ij,d_ij is the binary variable denoting whether the dataflow d_ij is placed on the network resource nr_ij.
Second, TS-IoT applications must satisfy their time-bound requirements, which we can formally denote as follows:

Total_Exec_App ≤ TB_App    (7)

Objective function: The objective of the task distribution problem is to devise a task execution plan in an IoT environment that yields the minimum application execution time while satisfying the time-bound and resource constraints. We formally denote it as follows:

Minimize: Total_Exec_App    (8)
Subject to: Equations (5)-(7).

However, this optimization problem is NP-hard, and to ensure the time-bound requirements of TS-IoT applications in such a volatile IoT environment, we need to generate time-bound satisfying task execution plans at the rate at which changes occur in the IoT environment. Existing optimization-based techniques devised for distributed systems cannot do this effectively. Hence, we solve this problem using a novel dynamic task distribution and adaptation technique, described in the next section. Table 1 lists the definitions of all the notations used in the above system model and problem formulation.
Table 1. Definitions of the notations used in the system model and the problem formulation.

Available_Res_cr_i: The quantity of resources available at the i-th computing resource
nr_ij: The network resources of a network link between two computing resources, cr_i and cr_j
Available_Band_nr_ij: The amount of available bandwidth of the network resource link nr_ij
G_App: Graph representing the TS-IoT application
Tasks: Tasks of the TS-IoT application
Dataflows: Dataflows between tasks
TB_App: Time-bound requirement of the TS-IoT application
t_i: The i-th task of the TS-IoT application
n: Total number of tasks in G_App
y: Total number of different paths through the application graph G_App
is_stateful_t_i: Binary variable denoting whether a task is stateful or not
q_t_i: Queue size of the i-th task
Task_Res_t_i: The amount of computing resources required for the execution of the task t_i
Proc_time_t_i: Time taken to process the IoT data at a specific computing resource by task t_i
Start_Delay_t_i: Time taken to produce the first data item during the IoT data processing of task t_i
Wait_Delay_t_i: The delay between producing data items of task t_i
d_ij: Dataflow between the predecessor task t_i and the successor task t_j
Data_d_ij: Size of a single data piece transferred through d_ij
Com_delay_d_ij: The amount of time to send a single piece of data via the network link carrying d_ij
Total_Exec_App: Total application execution time
Path_Exec_Time_p: End-to-end execution time along the path p of graph G_App

Task Management Technique That Includes Task Sizing, Distribution and Adaptation Techniques
In this section, we present the three novel techniques we devised for meeting TS-IoT application time-bounds as part of the task management technique. The task sizing technique is presented in Section 5.1, and its results are used by the task distribution technique, which is discussed in Section 5.2. The task adaptation technique, discussed in Section 5.3, adapts the task distribution by redistributing or reconfiguring individual tasks whenever a potential violation of the TS-IoT application's time-bound is detected.

Task Sizing Technique
The task sizing technique is used for measuring: (1) the computing and network resources needed for the execution of TS-IoT application tasks in the available IoT devices, edge computers, and the cloud; and (2) the execution times of the TS-IoT application tasks. Unlike existing task sizing techniques in the literature, which estimate the computing and network resources and execution times needed to complete the IoT data analysis tasks of TS-IoT applications, the proposed technique executes the TS-IoT application tasks on the available computing resources in the IoT environment and measures these resources and execution times directly. This yields more realistic resource needs and execution times for TS-IoT application tasks than estimation can provide. The task sizing technique is executed whenever the application developer submits a new TS-IoT application to the TIDA platform, or whenever there is a change in the underlying IoT environment, such as resources becoming disconnected or connected. In this technique, each task of the TS-IoT application is executed on every available unique IoT device, edge computer, and cloud resource using sample IoT data submitted by the TS-IoT application developer. The computing and network resources needed for the execution of each TS-IoT application task in the IoT environment are measured and recorded. These resource measurements are then used by the task distribution technique discussed in Section 5.2, and for training the machine learning (ML) model used in the task adaptation technique, which is discussed in Section 5.3.
The task sizing measurement data mainly captures the resource utilization metrics (e.g., CPU utilization and memory utilization) and the execution times of the tasks when they are executed on the available computing resources. As shown in Table 2, the raw task sizing measurement data consists of 12 features. Unlike earlier task sizing solutions, e.g., [3], this task sizing technique employs an unsupervised machine learning technique called Principal Component Analysis (PCA) [45] to extract the most important features from the 12 features of the task sizing measurement data and then stores these extracted features in the database. Feature extraction is the process of constructing a new set of features that is more informative and less redundant than the raw measured data. Feature extraction techniques such as PCA can find correlations among the features in the task sizing measurement data. For example, consider two features from the task sizing measurement data: the average CPU usage and the maximum CPU usage. These two features could be closely correlated; hence, keeping both of them yields no additional benefit. Many other features may also be highly correlated with each other. Therefore, using a feature extraction technique such as PCA, a summarized version of the original features (which is more informative and non-redundant) can be derived from a combination of the original features. After applying PCA to the task sizing measurement data, we were able to derive five new features that better describe the data. The extracted features are then used to train the machine learning model used in the task adaptation technique. In addition, during the execution of the application, as performed in the task sizing, we periodically measure the resource utilization and execution times. This monitored data is then used to periodically update the task sizing measurements.
This provides continuously updated task sizing measurements and allows this technique to deal with variations in the volume and velocity of IoT data that lead to unexpected demand for computing resources. Table 2 shows the features of task sizing measurement data.
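To make the feature extraction step concrete, the following sketch applies PCA to a synthetic task sizing measurement matrix with 12 features and keeps 5 components. It uses plain NumPy rather than the platform's actual implementation, and the measurement data is randomly generated for illustration only.

```python
import numpy as np

# Minimal PCA sketch for the task sizing measurements: project 12 raw
# features (as in Table 2) onto 5 principal components. Synthetic data.
rng = np.random.default_rng(0)
measurements = rng.normal(size=(200, 12))          # 200 sizing runs x 12 features
# make two columns correlated (e.g., average vs. maximum CPU usage)
measurements[:, 1] = measurements[:, 0] * 0.9 + rng.normal(scale=0.1, size=200)

# 1. Standardize, since the raw features are on different scales (%, MB, ms).
X = (measurements - measurements.mean(axis=0)) / measurements.std(axis=0)

# 2. Eigendecomposition of the covariance matrix.
cov = np.cov(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]                  # largest variance first
components = eigvecs[:, order[:5]]                 # top 5 principal axes

# 3. Project the measurements onto the 5 extracted features.
extracted = X @ components
print(extracted.shape)  # (200, 5)
```

The extracted matrix is what would be stored in the database and fed to the XGBoost model in place of the 12 correlated raw features.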
More in-depth details of this task sizing technique can be found in [3]. Most of the task sizing techniques in the literature rely on limited testbed implementations or simulation tools to size the TS-IoT application tasks (i.e., to estimate the amount of resources required for the execution of the tasks and the execution time of each task on each resource). However, none of the existing task sizing techniques can effectively deal with (1) the variations in the volume and velocity of IoT data that are common in IoT, and (2) the computing resource heterogeneity of real IoT environments. Unlike these existing techniques, the technique proposed here measures the actual (i.e., not simulated) computing and network resources required to complete each task of the TS-IoT application at hand on the available computing resources in the IoT environment. Therefore, it provides more accurate estimates of the resources required for each TS-IoT application task and of the execution time of each task per unique IoT device, edge computer, and cloud computing resource available. Section 5.2 discusses the task distribution technique that uses the resource estimates (i.e., task sizing measurements) produced by the task sizing technique. The pseudocode of the task sizing technique is shown in Algorithm 1.

Algorithm 1 Task sizing technique
1: function TaskSizing (TaskList, ResourceList)
2: foreach task in TaskList
3: foreach resource in ResourceList
4: Execute the task and measure the computing and network resource usage and execution time
5: end foreach
6: end foreach
7: Return the measured data
8: end function

Task Distribution Technique
In this section, we propose a novel task distribution technique that follows a greedy heuristic approach to incrementally solve the task distribution problem and generate a time-bound satisfying task execution plan. As mentioned in Section 5.1, the greedy task distribution technique utilizes the task sizing measurements produced by the task sizing technique to construct a time-bound satisfying task execution plan for executing each TS-IoT application. The greedy task distribution technique can be re-executed as needed to construct different task execution plans in situations where certain computing resources are disconnected from or connected to the IoT environment after the greedy task distribution technique was previously executed for any TS-IoT application. The pseudocode of the proposed greedy task distribution technique is shown in Algorithm 2.
The technique takes the TaskList, ResourceList, MeasuredData, and TB_App as inputs. Then, for each task in the TaskList, the technique finds an eligible computing resource (i.e., one with enough capacity to fulfill the resources required by the task) that yields the lowest execution time for that task from a sorted resources map. To construct the sorted resources map for the first task in the TaskList, the technique uses only the computing resources that are closer to the IoT data source, which it finds using the GetResourcesCloserToDataSource() function. Therefore, the first task of the application is always assigned to a computing resource that is close to the data source, provided it has enough resource capacity (lines 4-5 in Algorithm 2). To construct the sorted resources map for tasks that have predecessor tasks, the algorithm retrieves the tuples of the corresponding task from the measurement table and constructs a sorted resources map using the data in the tuples. The map consists of the resources and the corresponding execution time measured for that task. The map is sorted based on the measured execution times, and we consider that one computing resource can host multiple tasks if it has enough resource capacity (lines 6-8 in Algorithm 2).
Once the sorted resources map is created, the technique iterates through each item in the map until it finds an eligible computing resource. When the technique identifies an eligible computing resource, it assigns that resource to the corresponding task by updating the task distribution map, updates the available resources of the selected resource, updates Total_Exec_App based on the estimated execution time, exits the while loop, and moves to the next task in the task list (lines 10-19 in Algorithm 2). In this way, the technique iteratively determines an eligible computing resource in a greedy manner (i.e., picks the resource that would yield the lowest execution time) for each task in the task list.
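The greedy selection described above can be sketched as follows. The task names, resource capacities, and measured execution times are hypothetical stand-ins for the task sizing measurements, and the function is a simplified, single-path view of Algorithm 2 (it ignores network-link constraints).

```python
# Hedged sketch of the greedy task distribution (in the spirit of Algorithm 2).
measured = {  # (task, resource) -> (measured exec_time_s, required capacity)
    ("pre_process", "iot_dev"): (2.0, 1), ("pre_process", "edge"): (1.0, 2),
    ("classify", "iot_dev"): (9.0, 2), ("classify", "edge"): (3.0, 3),
    ("classify", "cloud"): (2.0, 4), ("count", "edge"): (0.5, 1),
    ("count", "cloud"): (0.3, 1),
}
available = {"iot_dev": 2, "edge": 4, "cloud": 8}
closer_to_source = {"iot_dev", "edge"}   # candidates for the first task only

def distribute(task_list, tb_app):
    plan, total_exec = {}, 0.0
    for i, task in enumerate(task_list):
        candidates = [(t, r, need) for (tk, r), (t, need) in measured.items()
                      if tk == task and (i > 0 or r in closer_to_source)]
        # "sorted resources map": lowest measured execution time first
        for exec_time, resource, need in sorted(candidates):
            if available[resource] >= need:        # eligible resource found
                plan[task] = resource
                available[resource] -= need        # update remaining capacity
                total_exec += exec_time            # update Total_Exec_App
                break
        else:
            raise RuntimeError(f"no eligible resource for {task}")
    if total_exec > tb_app:
        raise RuntimeError("no time-bound satisfying plan found")
    return plan, total_exec

plan, total = distribute(["pre_process", "classify", "count"], tb_app=10.0)
print(plan, total)
```

With these numbers, the first task lands on the edge (closest eligible, fastest) and the remaining tasks on the cloud, for a total estimated execution time of 3.3 s.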
It should be noted that the IoT environment can be volatile, because in the real world the IoT data volume/velocity and the available computing resources often vary. Although the task distribution technique can be executed several times to produce different task execution plans whenever computing resources are disconnected or connected, it is not capable of changing or adapting its current task execution plan to deal with possible time-bound violations caused by varying IoT data volumes and available computing resources. Therefore, the task execution plans constructed by the task distribution technique may not achieve the time-bound requirements of the TS-IoT application under such volatility. To overcome this drawback, we propose a novel task adaptation technique, which is discussed in Section 5.3.

Task Adaptation Technique
The objective of the task adaptation technique is to dynamically adapt the task execution plan of the TS-IoT application to mitigate potential time-bound violations caused by unexpected changes in the IoT device observation volume and velocity, and by computing resources becoming unavailable due to the volatility of the IoT environment. To achieve this, the task adaptation technique employs a variation of the XGBoost regression tree model [46] to periodically predict the application's total execution time, assuming that the application continues to follow its current task execution plan. Next, the predicted total application execution time is compared with the time-bound requirement of the TS-IoT application to assess whether the current task execution plan can meet the time-bound requirement. If the predicted total application execution time is within the time-bound requirement of the TS-IoT application, the current task execution plan remains unchanged. If the predicted total application execution time exceeds the time-bound of the TS-IoT application, an alternative task execution plan that can guarantee the time-bound requirements is selected from a set of alternative task execution plans (the creation of alternative execution plans is discussed further in Section 5.4). More specifically, when selecting an alternative task execution plan, the XGBoost model is used to predict the total application execution time of each alternative task execution plan, and the plan that yields the lowest predicted total application execution time is picked. After an alternative task execution plan is selected, the running tasks are stopped; the TS-IoT application tasks are then redistributed according to the alternative task execution plan, and execution starts again.
The task adaptation technique continuously trains the XGBoost model in an online manner using the features extracted from the task sizing measurement data (as shown in Table 2 and discussed in Section 5.1). XGBoost is a tree-based ensemble machine learning algorithm, which was selected as the basis of the proposed task adaptation technique due to its fast convergence speed. Compared to the other performance prediction models, such as neural networks [47], decision tree-based models require less training data, less training time, and less parameter tuning. Moreover, the ensemble tree-based algorithms, which combine several decision trees, performed better when compared to single decision tree-based models [44].
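The adaptation decision itself can be sketched as follows. StubModel is a stand-in for the trained XGBoost regressor (anything with a predict method works), and the plan dictionaries and predicted times are invented for illustration.

```python
# Hedged sketch of the adaptation decision: keep the current plan if its
# predicted Total_Exec_App is within the time-bound, otherwise switch to the
# alternative plan with the lowest predicted execution time.

def adapt_if_needed(model, current_plan, alternative_plans, tb_app):
    """Return the plan to execute next (possibly unchanged)."""
    if model.predict(current_plan) <= tb_app:
        return current_plan                      # no violation expected
    # predicted violation: pick the best alternative
    return min(alternative_plans, key=model.predict)

class StubModel:
    """Stand-in predictor; a real deployment would use the trained XGBoost model."""
    def predict(self, plan):
        return plan["predicted_time"]

current = {"name": "plan_a", "predicted_time": 12.0}
alternatives = [{"name": "plan_b", "predicted_time": 8.0},
                {"name": "plan_c", "predicted_time": 6.5}]

chosen = adapt_if_needed(StubModel(), current, alternatives, tb_app=10.0)
print(chosen["name"])  # plan_c: lowest predicted execution time
```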

Combining the Task Sizing, Distribution, and Adaptation Techniques to Meet TS-IoT Application Time-Bounds
In this section, we outline how the techniques discussed in Sections 5.1-5.3 are combined in an integrated task management technique called the Dynamic Task Distribution and Adaptation (DTDA) technique. In summary, DTDA ensures that TS-IoT applications meet their time-bounds as follows: Step 1: Size the TS-IoT application's tasks on the available resources (i.e., all available IoT devices, edge computers, and cloud virtual machines) using the task sizing technique (line 4 in Algorithm 3).
Step 2: Train the XGBoost model according to the task adaptation technique using the features extracted from the task sizing measurement data (lines 5-8 in Algorithm 3).
Step 3: Construct all the possible task execution plans for the submitted TS-IoT application. For this purpose, we generate all the combinatorial possibilities for distributing the tasks of the given TS-IoT application to the available computing resources in the IoT environment. For example, consider an instance where a given TS-IoT application comprises three tasks (e.g., t 1 , t 2 , and t 3 ) and an IoT environment that has two available computing resources (e.g., cr 1 and cr 2 ). In this step, all the different possible distributions of the tasks t 1 , t 2 , and t 3 across the two computing resources cr 1 and cr 2 are generated. As discussed in Section 4, each different task distribution possibility in the IoT environment is considered a task execution plan; thus, we identify all the possible task execution plans for the submitted TS-IoT application. Pre-computing all the possible task execution plans enables the task adaptation technique to quickly pick an alternative task execution plan, i.e., one that will meet the time-bound of the application, from the list of pre-computed plans (line 7 in Algorithm 3).
Step 4: Generate a time-bound satisfying task execution plan (from this point on we refer to this as the current task execution plan) using the greedy task distribution technique, which was discussed in Section 5.2 (line 9 in Algorithm 3).
Step 5: Distribute the tasks into the IoT devices, edge computers, and/or cloud-based virtual machines according to the task execution plan and start executing the application tasks (line 10 in Algorithm 3).
Step 6: Periodically measure the resource usage and execution times of the tasks to update the task sizing measurement data by (1) extracting the features from them, and (2) using the extracted features to retrain the ML model (lines 13-18 in Algorithm 3).
Step 7: Periodically predict the total application execution time of the task execution plan at hand using the XGBoost model (lines 20-22 in Algorithm 3).
Step 8: Assess whether the task execution plan at hand can guarantee the time-bound requirement of the TS-IoT application using the predicted total application execution time (line 23 in Algorithm 3).
Step 9: If a possible time-bound violation is detected, select an alternative task execution plan from the set of pre-computed task execution plans (lines 24-34 in Algorithm 3).
Step 10: Redistribute the TS-IoT application tasks according to the alternative task execution plan and restart the application execution (line 35 in Algorithm 3).
The pseudocode of the proposed DTDA technique is shown in Algorithm 3.
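As an illustration of Step 3, enumerating every task-to-resource assignment for the example above (three tasks, two computing resources) can be sketched with itertools; the task and resource names mirror the example in the text.

```python
from itertools import product

# Sketch of Step 3: enumerate every task-to-resource assignment. With three
# tasks and two computing resources this yields 2^3 = 8 candidate plans.
tasks = ["t1", "t2", "t3"]
resources = ["cr1", "cr2"]

plans = [dict(zip(tasks, assignment))
         for assignment in product(resources, repeat=len(tasks))]
print(len(plans))   # 8
print(plans[0])     # {'t1': 'cr1', 't2': 'cr1', 't3': 'cr1'}
```

Note that the number of plans grows as |resources|^|tasks|, which is why the set is pre-computed once rather than rebuilt on every adaptation.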

TIDA Platform
To meet the time-bound requirements of TS-IoT applications we developed a novel time-sensitive IoT data analysis (TIDA) platform [48]. TIDA utilizes the combination of the task sizing, distribution, and adaptation techniques as described in Section 5 to distribute and execute the tasks of each TS-IoT application in the cloud, edge computers, and IoT devices. In this section, we discuss the design and implementation of the TIDA platform.

Architecture of the TIDA Platform
In this section, we discuss each component of the architecture. The architecture for the TIDA platform is illustrated in Figure 4.
Task Transformation Engine: To execute any TS-IoT application irrespective of its underlying application model, we propose a transformation technique that transforms the data analysis tasks of any TS-IoT application into a set of common executable units of the TIDA platform. We refer to these executable units as "actors". Each actor has the following characteristics: (1) it represents a data analysis task of the TS-IoT application, (2) it is functionally equivalent to its corresponding data analysis task, and (3) it is independent of any application model. The transformation engine is responsible for this function of the platform; it takes any TS-IoT application specification as an input and transforms its data analysis tasks into a set of functionally equivalent actors that can be executed by the platform.
Task Sizing Engine: This engine is responsible for implementing the task sizing technique that involves measuring computing and network resources for the execution of tasks of the TS-IoT application.
Task Distribution Engine: This implements the task distribution technique and is responsible for efficiently managing the distribution of tasks. The task distribution engine is comprised of two modules. These are the distribution planner module and distribution invocator module. Distribution planners can accommodate different task distribution techniques. Task distribution techniques (such as the dynamic task distribution and adaptation technique discussed in Section 5) generate task execution plans. Then, these task execution plans are sent to the distribution invocator, which then distributes the tasks to the corresponding resources according to the plan and invokes their executions.
Monitoring Engine: This engine periodically monitors the execution landscape in terms of resource utilization (CPU usage and RAM usage) and the execution progress of tasks. It comprises a measuring module and a metrics module. The measuring module is responsible for periodically collecting the resource utilization and task execution progress and forwarding this information to the metrics module. The metrics module then uses this information to compute numerous metrics [33], which are then forwarded to the reasoner engine. Prediction Engine: By predicting the application's total execution time, the prediction engine can assess whether the tasks at hand can meet the time-bound requirements with the task execution plan at hand, as discussed in Section 5.3. If the plan cannot meet the time-bound requirements, the task adaptation engine is triggered to make the necessary adaptations to the task execution plans to deal with predicted future time-bound violations. Task Adaptation Engine: This engine implements the task adaptation technique and is responsible for making the necessary dynamic adaptations to the task execution plans whenever the prediction engine predicts a possible future time-bound violation. To achieve this, the task adaptation engine selects an alternative task execution plan that can guarantee the time-bound requirements from a set of pre-computed task execution plans. Finally, it triggers the task distribution engine to redistribute the tasks according to the alternative task execution plan.

Implementation of the TIDA Platform
A proof-of-concept implementation of the TIDA platform [48] was developed using Microsoft's Orleans Actor framework [49]. Orleans actors are designed to scale elastically, and they can run on any operating system that has .NET Core installed. Therefore, we implemented the TIDA platform's underlying executable units as Orleans actors. This enabled us to develop a highly scalable and efficient task management system, including a proof-of-concept task distribution engine. Furthermore, we implemented the DTDA technique discussed above as part of the task distribution engine. The transformation engine was implemented as a .NET Core class library. For the proof-of-concept implementation of this research, we developed a wrapper that reads a workflow specification file modeled using the Camunda [50] workflow modeler. The monitoring engine was implemented as an Orleans start-up service, which is activated when TIDA is up and running. The monitoring engine periodically monitors the execution landscape in terms of resource utilization and the execution progress of tasks, and its metrics module computes the metrics mentioned in Section 6.1. The reasoner engine was implemented as a Flask Python service, with the machine learning model implemented using the Python XGBoost library; the other engines interact with the reasoner engine via REST API calls. The task adaptation engine was implemented as another .NET Core class library. A PostgreSQL [51] relational database was used as the storage provider; it stores the measured data, feature data, application-specific data, and information about the resources, such as the health of each resource.

Time-Sensitive IoT Application, Dataset and Experimental Setup
To evaluate the TIDA platform and its techniques, we implemented the passenger counting TS-IoT application discussed in Section 3. The application was modeled as a workflow application using the Camunda workflow modeler. As discussed in Section 3, the passenger counting TS-IoT application comprises three main data analysis tasks: (1) a pre-processing task, (2) a classification task, and (3) a counting task. The pre-processing task is mainly responsible for pre-processing the passenger onboarding video data by converting it from RGB to gray-scale. The classification task uses a Haar-cascade machine learning classifier to classify passengers as entering or exiting and to maintain the entering/exiting counts. The counting task simply aggregates the exit and enter counts received from the classification task to compute the total occupancy of the bus. We used the OpenCV library for the pre-processing and classification tasks and implemented the application using C#. Figure 5 shows the application graph of the passenger counting IoT application. The dataset used for the experimental evaluation was obtained during a project carried out in Sydney, Australia [52], using an Orbbec Persee device, which provides internal computing and storage resources consisting of a Quad-core Cortex A17 processor (with a processing speed of 1.8 GHz), 2 GB RAM, and 8 GB internal storage. We used three passenger onboarding video files of three different sizes (12.5, 25, and 37.5 MB) for the evaluations. Each of these video files has a resolution of 640 × 480 and 30 frames per second (FPS). Figure 6 shows a sample of the passenger onboarding data. The experiments were conducted in a cloud-based environment. For this purpose, we created a testbed in the cloud using the NECTAR research cloud [53].
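The RGB-to-grayscale conversion performed by the pre-processing task can be sketched as follows. The actual implementation uses OpenCV from C#; this NumPy sketch applies the standard ITU-R BT.601 luminance weights (the same weights OpenCV's RGB-to-gray conversion uses) to a synthetic frame of the same 640 × 480 resolution.

```python
import numpy as np

# Sketch of the pre-processing step: convert one RGB video frame to grayscale.
# The frame here is synthetic; real frames would come from the onboarding video.
frame = np.random.default_rng(1).integers(0, 256, size=(480, 640, 3),
                                          dtype=np.uint8)  # 640 x 480, RGB

weights = np.array([0.299, 0.587, 0.114])      # BT.601 R, G, B luminance weights
gray = (frame @ weights).astype(np.uint8)      # one intensity value per pixel

print(gray.shape)  # (480, 640)
```

Dropping two of the three channels this way roughly cuts the data volume that the classification task must receive to a third, which matters given the 4 Mbps link used in the evaluation.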
The computing resources of the experimental testbed were categorized into four clusters of cloud virtual machines, and each cluster consisted of three cloud virtual machines. To emulate edge computers and IoT devices, we created eight cloud virtual machines (four that emulate edge computers and four that emulate IoT devices) with system configurations similar to those of real-world edge computers and IoT devices. For this purpose, we used the system configuration of the Cisco 807 industrial services router for the edge computers and the Orbbec Persee camera's system configuration for the IoT devices. In summary, the experimental setup comprised the following computing resources: Cluster 1 consisted of three virtual machines that emulated IoT devices; Cluster 2 consisted of three virtual machines that emulated edge computers; Cluster 3 consisted of three cloud virtual machines; and Cluster 4 consisted of three virtual machines that emulated an IoT device, an edge computer, and a cloud virtual machine. In this evaluation, we simulated the IoT data communication through the network because all the computing resources were created in the cloud. To compute the total data communication time, the sample video files and the pre-processed video files (i.e., the data output of the pre-processing task) were transferred to the cloud virtual machines over the Internet (bandwidth 4 Mbps) from a local computer, and the time taken to transmit them was measured. Figure 7 shows the four computing resource clusters and the computing resources of each cluster with their corresponding system configurations.
Table 3 illustrates the overall system configurations of the computing resources in the experimental setup.

Experimental Evaluation Methodology and Evaluation Metrics
We compared the performance of the proposed task management technique (DTDA) with three state-of-the-art task management techniques, namely binpack [54], random [55], and greedy [3]. We implemented the abovementioned task management techniques in the TIDA platform to fairly compare the performance of the aforementioned techniques under different computing resource clusters and different quantities of data (passenger onboarding video size). Table 4 shows the task management techniques we used for this experimental evaluation.


Table 4. Task management techniques used in the experimental evaluation, indicating for each technique whether it performs task sizing, task distribution, and task adaptation.

Binpack technique [54] (task sizing: no; task distribution: yes; task adaptation: no). The binpack task management technique manages the TS-IoT application tasks by distributing them to the computing resources that have the least available amount of CPU and memory. This minimizes the number of resources in use and is more cost-efficient. However, this technique does not size the tasks before distribution and is not capable of adapting its task distribution at runtime.

Random technique [55] (task sizing: no; task distribution: yes; task adaptation: no). The random task management technique manages the TS-IoT application tasks by distributing them to computing resources randomly and executing them. However, this technique does not size the tasks before distribution and is not capable of adapting its task distribution.

Greedy technique [3] (task sizing: yes; task distribution: yes; task adaptation: no). The greedy technique sizes the tasks and then uses that task sizing information to generate a time-bound satisfying task execution plan and distribute the tasks according to that plan. However, it cannot adapt its task execution plans at runtime.

DTDA technique (task sizing: yes; task distribution: yes; task adaptation: yes). The DTDA technique is the task management technique proposed in this paper and described in Section 5.
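The distribution policies compared in Table 4 differ only in how they pick a resource for a task. The following Python sketch contrasts them on a toy resource model; the resource names, capacities, and task-sizing numbers are hypothetical, and the DTDA adaptation step is omitted here.

```python
import random

# Each resource is (name, available CPU units); a task demands one unit.
resources = [("iot-device", 1), ("edge", 4), ("cloud-vm", 16)]

def binpack_place(resources):
    # Binpack: pick the resource with the LEAST available capacity that
    # can still fit the task, minimizing the number of resources in use.
    fitting = [r for r in resources if r[1] >= 1]
    return min(fitting, key=lambda r: r[1])[0]

def random_place(resources, rng):
    # Random: any resource, with no regard for capacity or time-bounds.
    return rng.choice(resources)[0]

def greedy_place(resources, task_sizes):
    # Greedy: use task-sizing measurements (estimated execution time per
    # resource) to build a static, one-shot time-bound satisfying plan.
    return min(resources, key=lambda r: task_sizes[r[0]])[0]

# Hypothetical per-resource execution-time estimates in seconds.
task_sizes = {"iot-device": 70.0, "edge": 20.0, "cloud-vm": 12.0}
print(binpack_place(resources))             # -> iot-device
print(greedy_place(resources, task_sizes))  # -> cloud-vm
```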
We continuously executed the passenger counting IoT application tasks for 1 h in each computing resource cluster (as discussed in Section 7.1) using the task management techniques presented in Table 4. During the execution, at regular intervals (i.e., every 2 min), we generated IoT data processing requests (i.e., requests sent to the TS-IoT application to process the IoT data generated by IoT devices, along with the generated data) with varying passenger onboarding video data sizes to be processed by the passenger counting IoT application tasks. To vary the data sizes, we used the following passenger onboarding video files: 12.5, 25.0, and 37.5 MB. Generating IoT data processing requests at regular intervals with varying data sizes enabled us to replicate how a real passenger counting IoT application deployed on buses processes different quantities of passenger onboarding data collected at bus stops, as discussed in Section 3. In this experimental evaluation, we considered the following two evaluation scenarios:

1.
Evaluate the impact of task management techniques on the total application execution time of the passenger counting IoT application.
The main objective of this evaluation scenario was to evaluate the impact of the task management techniques on the total application execution time (i.e., the end-to-end time taken to complete a single IoT data processing request) of the passenger counting IoT application when it is executed in different computing resource clusters with varying data sizes. To realize this, during the experimentation, for each IoT data processing request, we measured the data processing time and the data communication delay incurred by the passenger counting IoT application for completing the IoT data processing request. We denote the data processing time as Data_Processing_Time_App and the data communication delay as Data_Comm_Delay_App. We then used these two metrics to compute the total application execution time (denoted as Total_Exe_App) for completing an IoT data processing request as follows:

Total_Exe_App = Data_Processing_Time_App + Data_Comm_Delay_App

It should be noted that, in general, more attributes can contribute to the Total_Exe_App of a TS-IoT application. However, in this experiment, we considered Data_Processing_Time_App and Data_Comm_Delay_App as the two main attributes that make up Total_Exe_App. The results of this experimental evaluation scenario are presented in Section 7.3.1.

2.
Evaluate the impact of task management techniques on the time-bound violation of the passenger counting IoT application.
The main objective of this evaluation scenario was to assess how well the task management techniques supported the time-bound requirements of the passenger counting IoT application when it is executed in different computing resource clusters with varying data quantities. To realize this, we first set the time-bound requirement of the passenger counting IoT application to 60 s, which is the average time to travel between two bus stops. For each IoT data processing request, we then observed whether the passenger counting IoT application's Total_Exe_App exceeded the time-bound requirement (i.e., 60 s). A time-bound violation happens when the Total_Exe_App of the passenger counting IoT application exceeds the time-bound requirement when processing an IoT data processing request. We used an evaluation metric called the time-bound violation count to capture the total number of time-bound violations that occurred during the experimentation. We denote the time-bound violation count as V_Count. Moreover, to compare the impact of the task management techniques on time-bound violations, we computed an evaluation metric called the time-bound violation ratio, which we denote as V_Ratio. The time-bound violation ratio (V_Ratio) is the time-bound violation count over the total number of IoT data processing requests completed during the experimentation. We denote the total number of IoT data processing requests completed as Total_Requests, and V_Ratio is computed as follows:

V_Ratio = V_Count / Total_Requests

The range of V_Ratio is from 0 to 1, where 0 is the best-case scenario, in which each IoT data processing request is completed within the time-bound requirement of the TS-IoT application, whereas 1 is the worst-case scenario, in which the passenger counting IoT application has failed to meet the time-bound requirement for every IoT data processing request. The results of this experimental evaluation scenario are presented in Sections 7.3.2 and 7.3.3. Table 5 provides a summary of the evaluation metrics discussed in this section.
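The experimental driver and the two metrics above can be sketched as follows. This is an illustrative Python sketch using the notation of this section; the per-request measurements at the bottom are invented numbers, not results from the paper.

```python
from itertools import cycle

# Experiment driver (hypothetical sketch): one IoT data processing
# request every 2 min over a 1 h run, cycling the three video sizes.
SIZES_MB = [12.5, 25.0, 37.5]

def request_schedule(duration_min=60, interval_min=2):
    sizes = cycle(SIZES_MB)
    # (minute offset, payload size in MB) for each generated request
    return [(t, next(sizes)) for t in range(0, duration_min, interval_min)]

# Evaluation metrics.
def total_exe(data_processing_time, data_comm_delay):
    # Total_Exe_App = Data_Processing_Time_App + Data_Comm_Delay_App
    return data_processing_time + data_comm_delay

def v_ratio(v_count, total_requests):
    # V_Ratio = V_Count / Total_Requests, in [0, 1]
    return v_count / total_requests

TIME_BOUND_S = 60  # time-bound requirement of the passenger counting app

schedule = request_schedule()
print(len(schedule))  # -> 30 requests in one hour

# Hypothetical per-request (processing time, communication delay) in seconds.
measurements = [(40, 15), (50, 20), (30, 10), (55, 25)]
totals = [total_exe(p, c) for p, c in measurements]
v_count = sum(1 for t in totals if t > TIME_BOUND_S)
print(totals, v_count, v_ratio(v_count, len(measurements)))  # -> [55, 70, 40, 80] 2 0.5
```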
As discussed earlier, for this experimental evaluation scenario, we considered the total application execution time as the summation of the data processing time and the data communication delay of the passenger counting IoT application. For ease of comparison, we took the average values of the total application execution time, data processing time, and data communication delay, because these metrics were measured per IoT data processing request during the experimentation. Figure 8 illustrates the comparison of the average total application execution time, average data processing time, and average data communication delay of the passenger counting IoT application under each task management technique when executed in the four computing resource clusters with varying data sizes. By analyzing the results presented in Figure 8, we can see that as the size of the data increases, the average total application execution time also increases.
Results presented in Figure 8a show that executing the passenger counting IoT application in cluster 1 did not incur any data communication delay, but led to a higher average data processing time for each task management technique compared to that of the other clusters. This is mainly because cluster 1 consisted entirely of IoT devices, which (1) have limited computing resources and (2) do not involve any communication delay. Furthermore, we note that the impact of the task management techniques on the average total application execution time, average data processing time, and average data communication delay in cluster 1 varied inconsistently with the size of the data. More specifically, the binpack technique yielded the highest average data processing time in cluster 1 for all three data sizes, whereas the random, greedy, and DTDA techniques resulted in similar average data processing times when the size of the data was 12.5 and 25 MB. However, when the size of the data increased to 37.5 MB, the DTDA technique yielded the second-highest average data processing time (71.41 s).

By comparison, results presented in cluster 3 show that the average total application execution time, average data processing time, and average data communication delay of the passenger counting IoT application were not significantly impacted by the task management technique in use. In addition, we can observe that cluster 3 recorded the highest average data communication delay and lowest data processing time for each of the three data sizes compared to the other clusters. This is mainly because cluster 3 consisted of three cloud virtual machines, which (1) have higher computing resources and (2) involve high network delays.
Results for cluster 2 show that both the greedy and DTDA techniques performed equally well for each of the three data sizes and reduced the average data processing time compared to the binpack and random techniques: by 11.11% when the data size was 12.5 and 25 MB, and by 9.43% when the data size was 37.5 MB.
Results presented in Figure 8d illustrate that the binpack technique yielded the highest average data processing time for each of the three data sizes in cluster 4. Noticeably, the binpack technique did not involve any data communication delay. As we discussed in Section 7.2, it manages the TS-IoT application tasks by distributing them to computing resources that have the least available CPU and memory. Therefore, it is evident that, in this case, the binpack technique distributed all the tasks of the passenger counting IoT application to the IoT device; thus, no communication delay was involved. By comparison, the random, greedy, and DTDA techniques recorded similar average total application execution times when the size of the data was 12.5 and 25.0 MB. However, when the size of the data increased to 37.5 MB, the DTDA technique improved the average data processing time compared to the greedy technique by 20%, the random technique by 36%, and the binpack technique by 53%.

Time-Bound Violation Ratio of Task Management Techniques under Different Computing Resources
The comparison of time-bound violation ratio results in Figure 9 shows that the time-bound violation ratios for cluster 1 and cluster 3 are similar (i.e., 0.34) for each of the four task management techniques, with the exception of binpack in cluster 1, which resulted in a time-bound violation ratio of 0.5. This is mainly due to the computing resources in cluster 1 and cluster 3. Cluster 1 utilizes three IoT devices, which have limited resources, whereas cluster 3 utilizes three cloud virtual machines, which have more resources compared to cluster 1. When a TS-IoT application is executed entirely in a resource-constrained environment, its tasks may overload the most constrained resources, causing a time-bound violation. In addition, when a TS-IoT application is executed entirely on cloud virtual machines, the significant communication delays involved in sending IoT data to the cloud contribute to higher time-bound violation ratios. In these cases, the task management technique employed cannot prevent the time-bound violation of the passenger counting application.
However, we can note that the time-bound violation ratios varied significantly when the passenger counting application was executed in cluster 2 and cluster 4. The binpack technique manages TS-IoT applications by distributing the tasks of the application to computing resources that have the least available amount of CPU and memory. Although this minimizes the number of computing resources in use and is more cost-efficient, it leads to overloading of the computing resources that are processing the tasks. This can increase the data processing time of the tasks, which leads to time-bound violations. As we can see from the results in clusters 2 and 4, the binpack technique recorded the highest time-bound violation ratios. More specifically, it recorded time-bound violation ratios of 0.34 and 0.67 in clusters 2 and 4, respectively. By comparison, the random technique manages TS-IoT applications by randomly distributing the tasks of the application to computing resources. Therefore, the results obtained by executing the application under the random task management technique in clusters 2 and 4 show average results. In particular, this approach recorded a time-bound violation ratio of 0.34 in both clusters 2 and 4. The greedy technique generates time-bound satisfying task execution plans by considering the IoT environment, the time-bound requirements of the application, and the computing resource demands of the application tasks (obtained by the task sizing technique). Therefore, compared to the binpack and random task management techniques, the greedy technique recorded lower time-bound violations by distributing and executing the tasks according to a time-bound satisfying task execution plan that keeps the passenger counting application's total execution time within the application time-bound on most occasions.
However, the greedy technique is not capable of adapting its task execution plans to cope with the dynamic changes occurring in the IoT environment, such as varying IoT data sizes and available computing resources; thus, when the application was executed in clusters 2 and 4 under the greedy technique, we still observed time-bound violation ratios of 0.25 and 0.23, respectively. On the contrary, the proposed DTDA technique of TIDA is capable of adapting the task execution plans of the running application to deal with possible time-bound violations; thus, we can observe that the DTDA technique recorded the lowest time-bound violation ratios for the passenger counting application when it was executed in clusters 2 and 4. More precisely, DTDA recorded a time-bound violation ratio of 0.08 in cluster 2 and 0.13 in cluster 4. Compared to the greedy technique [3], the novel DTDA technique presented in this paper improves the time-bound violation ratio by 68% in cluster 2 and 42.8% in cluster 4.
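The relative improvement figures quoted above follow directly from the violation ratios. As a sketch, the cluster 2 case works out as follows (the helper function name is ours, not from the paper):

```python
def relative_improvement(baseline_ratio, new_ratio):
    # Percentage reduction of the time-bound violation ratio
    # relative to a baseline technique.
    return 100 * (baseline_ratio - new_ratio) / baseline_ratio

# Cluster 2: greedy recorded V_Ratio = 0.25, DTDA recorded V_Ratio = 0.08.
print(round(relative_improvement(0.25, 0.08)))  # -> 68
```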
The comparison of time-bound violation ratio results in Figure 10 shows that when the size of the data increases, the time-bound violation ratio also increases. More specifically, we can observe that, in all clusters, when the size of the passenger onboarding video was 12.5 and 25.0 MB, the random, greedy, and DTDA techniques recorded zero time-bound violations. However, we can observe that, in clusters 1 and 4, the binpack technique recorded time-bound violation ratios of 0.50 and 1.00 when the size of the passenger onboarding video reached 25.0 MB. By comparison, we note that when the size of the passenger onboarding video increased to 37.5 MB, all the task management techniques failed to fulfil the time-bound requirements of the passenger counting IoT application in clusters 1 and 3.
As discussed in the previous section, this is mainly due to the limited resources found in IoT devices and the high communication delays involved in sending the passenger onboarding data to the cloud. Figure 10b-d indicates that both binpack and random techniques were unable to support the time-bound requirements of the passenger counting IoT application when the size of the passenger onboarding data reached 37.5 MB. However, for the same size of the passenger onboarding data, the DTDA technique maintained the time-bound violation ratio within 0.25 in cluster 2 and 0.50 in cluster 4. These results demonstrate that, in cluster 2, the DTDA technique improved the time-bound violation ratio compared to the greedy technique by 67%, and compared to the random and binpack techniques by 75%. In cluster 4, the DTDA technique reduced the time-bound violation compared to the greedy technique by 33.33%, and compared to the random and binpack techniques by 50%. Finally, these results validate that the DTDA technique dynamically adapts the task execution plans to deal with the increased size of the passenger onboarding data, and this enables it to reduce the time-bound violations of the passenger counting IoT application compared to other task management techniques when the size of the data increases.
In summary, the evaluation results showed that when the passenger counting application was executed in cluster 1 (the cluster that consists of only IoT devices) and cluster 3 (the cluster that consists of only cloud virtual machines), regardless of the task management technique in use, on most occasions the passenger counting application failed to meet its time-bound requirement. The communication delay involved in sending data to the cloud and the resource-constrained nature of IoT devices may be the main reasons behind this behavior. Furthermore, the results showed that DTDA, our proposed task management technique of the TIDA platform, successfully fulfilled the time-bound requirements of the passenger counting IoT application with varying sizes of passenger onboarding data when the application was executed in cluster 2 (the cluster that consists of only edge computers) and cluster 4 (the cluster that consists of IoT devices, edge computers, and cloud virtual machines), whereas the three other task management techniques failed to meet the time-bound requirements. Moreover, the evaluation results showed that the DTDA technique improved the time-bound violation ratio by 42.84% and the data processing time by 20.42% compared to the greedy technique presented in [3] when the application was executed in cluster 4. Finally, the evaluation results demonstrated that the TIDA platform, which includes the DTDA technique, periodically predicts possible future time-bound violations and adapts its task execution plans to mitigate them. This enabled the platform to meet the time-bound requirements of the passenger counting IoT application more often than the other task management techniques.

Conclusions and Future Work
In this paper, we extended our previous research presented in [3] and proposed a complete solution, called TIDA, for meeting the time-bound requirements of TS-IoT applications. The proposed TIDA platform uses a novel task management technique, called DTDA, which combines three novel techniques: task sizing, task distribution, and task adaptation. The task sizing technique measures the computing and network resources required to complete each TS-IoT application task in each available computing resource in the IoT environment, whereas the task distribution technique uses the task sizing measurements to create time-bound satisfying task execution plans to distribute and execute TS-IoT application tasks in the IoT environment. Finally, the task adaptation technique uses a machine learning model to accurately and periodically predict possible time-bound violations and, in the case of a predicted time-bound violation, dynamically adapts the task distribution by redistributing tasks according to an alternative task execution plan.
We described a proof-of-concept implementation of the TIDA platform that implements the above techniques using Microsoft's Orleans framework. We evaluated TIDA by developing a passenger counting IoT application, executing the application in a cloud-based testbed under four TS-IoT application management techniques, and assessing how well each of these techniques enables the application to meet its time-bound requirements. The results showed that the dynamic task distribution and adaptation (DTDA) technique of TIDA (discussed in Section 5), on average, improves the time-bound violation ratio by 43.34% compared to the greedy technique, which is the baseline task management technique of the TIDA platform. Moreover, the evaluation demonstrated TIDA's ability to cope with varying volumes of IoT data by dynamically adapting the task execution plans to deal with possible time-bound violations.
Although the current implementation of the DTDA technique is capable of dynamically adapting its task execution plans, during the adaptation step the technique iterates through a set of pre-computed task execution plans to select an alternative time-bound satisfying plan, and then redistributes all the tasks to the computing resources according to the new plan. This step is neither efficient nor scalable, because it may cause a significant delay as the number of tasks and computing resources increases. Moreover, in this research, we considered only one TS-IoT application in the evaluation. However, in the real world, multiple IoT applications can compete for the computing and networking resources in the IoT environment, and resources often connect to and disconnect from the IoT environment. Therefore, in our future work, we aim to: (1) improve the DTDA technique to be more efficient and scalable; (2) conduct more experiments in a volatile IoT environment with multiple IoT applications; (3) compare the TIDA platform's ability to meet time-bound requirements with that of existing solutions; and (4) improve the existing machine learning technique by incorporating federated machine learning techniques [56]. Moreover, we plan to explore and incorporate (1) IoT services search and discovery techniques [57], (2) digital twin capabilities [58], and (3) contextualization [59] and approximation approaches [60] into the TIDA platform to make it a comprehensive IoT solution.