Optimizing Trajectory and Dynamic Data Offloading Using a UAV Access Platform

Abstract: The use of unmanned aerial vehicles (UAVs) as an integrated sensing and communication platform is emerging for surveillance and tracking applications, especially in large infrastructure-deficient environments. In this study, we develop a multi-UAV system to collect data dynamically in a resource-constrained context. The proposed approach consists of an access platform called the Access UAV (A_UAV) that stochastically coordinates data collection from the Inspection UAVs (I_UAVs), each equipped with a visual sensor, and relays the collected data to the cloud. Our approach jointly considers the trajectory optimization of the A_UAV and the stability of the data queues at each UAV. In particular, the Distance and Access Latency Aware Trajectory (DLAT) optimization for the A_UAV is developed, which generates a fair access schedule for the I_UAVs. Moreover, a Lyapunov-based online optimization ensures the stability of the average queue backlogs for dynamic data collection while minimizing the total system energy. Coordination between the I_UAVs and the A_UAV is achieved through a message-based mechanism. The simulation results validate the performance of our proposed approach against several baselines under different parameter settings.


Introduction
Intelligent solutions utilising unmanned aerial vehicles (UAVs) are emerging in various domains such as wireless sensing [1], payload delivery [2], precision agriculture [3] and search and rescue operations [4]. Moreover, with the current trend of automation, sensing, and information exchange within Industry 4.0 environments, UAV-based applications are also finding their place in the construction, mining, agriculture and logistics industries, especially for resource tracking and operational monitoring using aerial imagery. UAV-based solutions are particularly beneficial in large infrastructure-deficient environments as they offer ease of deployment, quick access to ground-truth data, and higher reachability and coverage [5,6]. Furthermore, autonomous or semi-autonomous UAV-based solutions could facilitate many industry-specific audits such as progress monitoring, resource and safety inspections, and environmental hazard assessments. Although integrating a multi-UAV based visual sensing and monitoring system has many benefits, developing such a system is challenging. The challenges include (1) monetary budget constraints, which limit the number of deployed UAVs, (2) the limited battery of UAVs, which restricts the observation span, (3) the limited on-board processing capabilities of UAVs, which necessitate online offloading of data, and (4) limited connectivity for data gathering and offloading tasks in infrastructure-deficient environments, which requires efficient trajectory planning. The computationally intensive nature of the data collection task and the limited on-board computation of UAVs impede the deployment of such solutions to collect and/or process data within large infrastructure-deficient sites, for example when tracking the progress of complex construction projects.
Hence, efficient multi-UAV mechanisms to collect and offload data from the field or points of interest (PoIs) to the backend system (cloud) are required to further leverage the advantages of multi-UAV systems. To address the above-mentioned challenges, a feasible solution could be the adoption of a heterogeneous multi-UAV framework.
UAVs can also be considered as integrated sensing and communication platforms for providing mobile edge computing services [7-9]. The edge-based UAV platform could be flexible and cost-effective for computational and data-offloading tasks in infrastructure-deficient environments. Furthermore, with the optimal trajectory design of an edge-based access platform, the overall coverage of the PoIs can be improved. Thus an Access UAV (A_UAV) platform can be optimized for data collection, basic data processing, and data offloading to the cloud within infrastructure-deficient environments. However, if the PoIs are arbitrarily distributed in 3D space, such as the monitoring points in under-construction projects, maneuvering the UAVs becomes challenging. UAVs that are more agile by design are therefore better suited to reach arbitrary locations quickly and safely. Furthermore, such agile UAVs can carry different sensors, such as multi-spectral cameras and near-infrared cameras, to collect the requisite data from the locations. Hence, instead of overloading the A_UAV with data collection and offloading tasks, it would be prudent to segregate the UAV functions to optimize them for specific tasks. In this paper, we introduce the Inspection UAVs (I_UAVs) that are morphologically different from the A_UAV, as they are smaller and more flexible to maneuver at lower heights. The I_UAVs act as a mobile visual sensing platform to collect visual data from different locations in the environment. I_UAVs can be considered mobile sensors scattered throughout the environment. The A_UAV thus needs to locate these dynamically moving I_UAVs in order to collect data from them.
Another problem is the buffer overflow of UAVs in data collection and offloading tasks. The limited on-board processing capability and the shared bandwidth available to transfer data lead to instability of the overall system. In addition, the varying data traffic and continuous movement of UAVs make it challenging to stabilize the system in a deterministic manner. Researchers have used online Lyapunov optimization [10] to address such system instabilities. Lyapunov optimization considers the stability of a system with time-varying data and optimizes the time averages of the system utility and queue backlogs.
In this study, we address the challenges of deploying a heterogeneous multi-UAV system comprising a single UAV access platform called the Access UAV (A_UAV) and multiple Inspection UAVs (I_UAVs) for dynamic data collection in infrastructure-deficient environments. We propose a Distance and Access Latency Aware Trajectory (DLAT) optimization for the A_UAV that considers fair access schedules for the dynamically moving I_UAVs. In addition, to address system instabilities due to queue backlogs, the proposed solution employs a Lyapunov-based online optimization approach. The overall problem is formulated as minimizing the total energy consumption of the system. Furthermore, as the set of I_UAVs and the A_UAV operate independently without any central entity, their coordination is ensured through a message-based estimation mechanism.
The contributions of this research are fourfold and are summarised as follows:
• An optimization model is proposed for optimal trajectory planning of the A_UAV to fairly access the dynamically moving I_UAVs in different time slots. The optimization model focuses on minimizing the distance travelled and generating a fair access schedule for the I_UAVs.
• From an applications point of view, this research focuses on trajectory optimization of the A_UAV that considers the limited battery constraint and the distance from the location of the battery replenishment unit at the site.
• Further, a model based on the Lyapunov-based online optimization framework is proposed to minimize the energy consumption and the queue backlog of the A_UAV and I_UAVs with limited storage capacity.
• Finally, extensive numerical experiments are conducted to evaluate the efficiency and performance of the proposed framework against multiple baselines.
The rest of the paper is organised as follows: Section 2 summarises the related literature on edge based UAV applications. Section 3 presents the proposed multi-UAV framework and the system model. The overall system objective is discussed in Section 4. Sections 5 and 6 discuss the access latency aware trajectory optimization and Lyapunov based system stability, respectively. Sections 7 and 8 discuss the experiments and results. Finally, Section 9 concludes the paper.

Related Work
Various studies have emphasized that trajectory planning of UAVs is an integral component of UAV-based inspection and monitoring applications [11,12]. In [13], the authors presented the reconstruction of a 3D model and highlighted the importance of UAV trajectories for computer vision techniques to reconstruct the 3D structure accurately. In [14], the authors discussed how mobile edge computing (MEC) can be divided into different architectures based on the role of UAVs, which could be users, computing entities, or data relay entities. The UAV-enabled MEC system is commonly employed in different scenarios to improve user experience and service availability or to increase the system's efficiency. The trajectory optimization of UAVs is an integral part of such MEC systems as it affects the energy consumption of the system and the service schedule of static or dynamic sensors. UAVs could be deployed to relay data further or provide partial computing to improve the system's overall quality of service (QoS). In [15], multiple UAVs were deployed for the task of relaying data from mobile devices to the BS. The overall objective was to minimize the energy consumption of mobile devices by jointly optimizing the task scheduling and UAV trajectories in a resource-constrained environment. Using a different approach, Ref. [16] proposed a single UAV-mounted cloudlet to serve a set of mobile users. The overall framework minimizes the energy consumption of mobile users while optimizing the trajectory of the UAV-mounted cloudlet. The work of Xu et al. [17] also considered a multi-UAV based computing framework to minimize the latency of the mobile device data relay task, either by on-board computing or by relaying to the BS. In [18], a hierarchical multi-coalition UAV MEC network was discussed where the resource-constrained UAVs could offload tasks to other UAVs with high computational resources to improve the overall system efficiency.
However, the authors did not consider the queue optimization, dynamic access of UAVs and challenges of an infrastructure-deficient environment as modelled in our work. In [19], the authors focused on minimizing the weighted sum of the energy consumption of a UAV-enabled MEC system. They performed joint optimization of computation resource scheduling, bandwidth allocation to user equipment (UEs), and trajectory optimization of UAV-based edge servers with static ground sensors. The advantage of using multiple UAVs in the MEC system is further studied in the work of Diao et al. [20], where the effects of joint optimization of the trajectories of multiple UAVs on the system metrics were considered. However, accounting for the dynamic evolution of the data queues of the UAV-based MEC system could alleviate the problems of queue stability and data offloading.
The authors in [21] addressed the stability issues with a Lyapunov-based joint resource optimization of bandwidth usage, processing power consumption, and transmission power. Zhang et al. [22] presented a complex system within a dynamic environment that involves joint optimization of the computation resources of multiple mobile users and the UAV-BS, together with trajectory optimization. The authors in [23] discussed a UAV-assisted mobile edge computing framework that jointly addressed energy minimization, trajectory optimization, CPU frequency and offloading schedule. In [24], the authors considered the completion time of the task along with the energy minimization and trajectory optimization of a UAV. One significant difference between our work and those reviewed in the literature is the estimation of the location of dynamic sensors (i.e., I_UAVs). This problem brings another challenge: coordination among the I_UAVs and the A_UAV in the absence of ubiquitous connectivity and with a limited battery. The proposed solution in this study attempts to solve both problems.
The literature also discusses the network scheduling problem along with trajectory scheduling for UAV-based MEC. In [25], the authors developed a hierarchical MEC system considering online optimization of computational resources and reinforcement learning based trajectory optimization of multiple UAV-BSs for collecting data from a set of static sensors. In [26], a sense-and-send transmission protocol was proposed using multiple UAVs in a cellular network with an iterative trajectory sensing and scheduling algorithm. However, this approach does not consider distributed and multi-layer interaction of UAVs to collect and offload data with limited connectivity. In [27], the authors employed reinforcement learning for sensing and sending information using a decentralized setup for multiple UAVs; however, their work did not consider a multi-layer UAV network with limited connectivity. As is apparent from the literature, resource scheduling in multi-UAV based solutions is a challenging task, particularly in an infrastructure-deficient environment with limited connectivity. Therefore, the dynamic deployment of mobile UAVs either to collect data or to relay data to the cloud could mitigate the issues of progress tracking and job monitoring in industrial settings and aid the performance of project deliveries. This paper proposes a solution for end-to-end data offloading in large infrastructure-deficient environments using a hierarchical multi-UAV system.

System Model
This section presents the key components of the proposed multi-UAV framework. The system consists of heterogeneous UAVs, including a set of N Inspection UAVs (I_UAVs) and a single UAV Access Platform (A_UAV). I_UAVs are smaller in size and more agile. They collect visual data from a set of k Points of Interest (PoIs) denoted as $l_i$. Because the framework considers infrastructure-less environments, only a limited number of Access Points (APs) are available for cloud connectivity. Further, I_UAVs possess a limited connectivity range, making it difficult to transfer data directly to the cloud. The A_UAV, which is larger in size and possesses higher computational capabilities, coordinates with the I_UAVs to collect data. We assume that the A_UAV always maintains a constant height, so its trajectory lies in a horizontal plane. Figure 1 shows a high-level overview of the system under consideration, with I_UAVs tasked to collect data from the PoIs, whereas the A_UAV collects data from the dynamically moving I_UAVs and relays it to the cloud. The notations used in this study are given in Table 1.

Communication Channel
The communication between an I_UAV and the A_UAV (A2A channel) has a limited range and capacity. This work denotes the achievable data transmission rate of the $i$th I_UAV in a given time slot as $d^{off}_i(t)$. The communication channel between the I_UAVs and the A_UAV involves both line-of-sight (LoS) and non-line-of-sight (NLoS) links as PoIs can be distributed vertically and longitudinally. Furthermore, the shadowing effect is also considered due to obstructions caused by buildings and other structures in the surroundings [28,29]. The path loss of a link is given as follows: where $X_\sigma$ is a shadowing factor that is inversely proportional to the altitude of the PoI, $\alpha \in \{LoS, NLoS\}$ and $\varphi$ is the path loss exponent. The probability of a LoS link, $P_{LoS}$, depends on the angle of elevation and environmental constants ($e_0$ and $e_1$) as given in Equation (2): The average path loss is calculated as: In this work, we have assumed Wi-Fi technology without a fixed access point for emergency or infrastructure-deficient scenarios [30]. The network of I_UAVs and the A_UAV provides connectivity to send the data collected from PoIs to the cloud.

Data Gathering Process
Each PoI ($l_j$) is a tuple $\langle d_j, O_j \rangle$, where $d_j$ specifies the amount of data (e.g., images) to be collected and $O_j$ denotes the 3D coordinates of the PoI. The sequence of PoIs to be visited is provided to the I_UAVs and is also shared with the A_UAV. During the traversal along the sequence of PoIs, if the buffer of any I_UAV overflows, then that I_UAV waits at the same PoI until its data is offloaded.
In order to gather and offload data, the A_UAV communicates with a single I_UAV in a time slot. Let $A_i(t)$ denote the data gathered by the $i$th I_UAV in time slot $t$. Let $Q_i(t)$ be the queue of the $i$th I_UAV and let $d^{off}_i(t)$ denote the amount of data offloaded to the A_UAV by the $i$th I_UAV in time slot $t$. The recursive equation to update $Q_i(t)$ is as follows: Let $L(t)$ be the queue of the A_UAV, where the A_UAV accepts the data from the selected I_UAV in time slot $t$. The following equation updates $L(t)$ recursively: where $d^{off}_{access}(t)$ is the amount of data offloaded to the cloud by the A_UAV in time slot $t$. Figure 2 shows the different functions performed by the A_UAV in a single time slot. The decision function takes negligible time to decide on the next I_UAV for data gathering, followed by the transition function, where the A_UAV takes $\tau_{trans}$ time to move near the next possible location to connect with the chosen I_UAV. The search function ($\tau_{search}$) estimates the location of the selected I_UAV based on the queue and position estimation algorithm given in Algorithm 1. The bound on the maximum time required to estimate the position of I_UAVs is discussed in Section 5.1. Finally, the data transmission function establishes communication with the I_UAV (if it is not shadowed). The sequence of functions mentioned above is repeated for every time slot. The next section describes the objective of the system and formulates it as an optimization problem.
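The queue recursions described above can be illustrated with a short Python sketch. It assumes the standard form used in Lyapunov-based queueing models, in which a queue is first drained by its service and then increased by its arrivals; the exact recursions in the original equations may differ in detail. The function name `update_queues` and its argument names are hypothetical.

```python
def update_queues(q, l, selected, arrivals, d_off, d_off_access):
    """Advance all queues by one time slot (illustrative form, an assumption).

    q            -- list of I_UAV backlogs Q_i(t)
    l            -- A_UAV backlog L(t)
    selected     -- index of the I_UAV accessed in this slot, or None
    arrivals     -- list of newly gathered data A_i(t)
    d_off        -- data the selected I_UAV can offload in this slot
    d_off_access -- data the A_UAV can offload to the cloud in this slot
    """
    received = 0.0
    q_next = []
    for i, (backlog, arrived) in enumerate(zip(q, arrivals)):
        # Only the selected I_UAV offloads, and never more than it holds.
        served = min(backlog, d_off) if i == selected else 0.0
        received += served
        q_next.append(backlog - served + arrived)
    # The A_UAV first drains towards the cloud, then accepts the new data.
    l_next = max(l - d_off_access, 0.0) + received
    return q_next, l_next


q_next, l_next = update_queues(
    q=[100.0, 50.0, 0.0], l=30.0, selected=0,
    arrivals=[10.0, 20.0, 30.0], d_off=120.0, d_off_access=40.0,
)
print(q_next, l_next)
```

Only one I_UAV is served per slot, which mirrors the single-access assumption stated in the text.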

System Objective
In the proposed framework, the offloading of data happens in two stages: (1) from the I_UAVs to the A_UAV and (2) from the A_UAV to the cloud. Our main focus is to achieve end-to-end data offloading to the cloud while minimizing the total energy consumption of the whole system ($E_{sys}$), given as: where $E^{trans}_{access}(t)$ is the transition energy of the A_UAV, $E^{comm}_{access}(t)$ is the transmission energy of the A_UAV, $E^{hover}_{access}(t)$ is the hovering energy of the A_UAV and $E^{comm}_i(t)$ is the transmission energy of the $i$th I_UAV. The following subsections discuss the details of calculating each component of the energy consumption in Equation (6).

Transition Energy of A_UAV
The transition energy of the A_UAV refers to the energy consumed when moving from one location to another [11,16,31], which is given as: where $\kappa$ is a constant that depends on the total mass of the A_UAV, $vel(t)$ is the velocity of the A_UAV and $\tau_{trans}$ is the time taken to transit from one location to another.

Transmission Energy of A_UAV
The A_UAV offloads data to the cloud via a wireless channel [32]. The transmission energy consumed to transmit the data to the cloud is given as: where $\tau_{comm}$ is the time allotted for data transmission. The other parameters are defined in Table 1.

Hovering Energy of A_UAV
The A_UAV hovers above the PoI to collect the data. The hovering energy consumed to collect the data is given as:
$$E^{hover}_{access}(t) = P_{hover} \cdot \tau_{hover} \qquad (9)$$
where $P_{hover}$ is the power consumed per unit time while hovering and $\tau_{hover}$ is the hovering time.

Transmission Energy of I_UAVs
The energy consumed for offloading the $d^{off}_i(t)$ data bits at time slot $t$ from the selected I_UAV to the A_UAV using the Air-to-Air channel of bandwidth $W$ Hz is given similarly to Equation (8) as: The wireless (Air-to-Air) channel power gain ($\zeta$) from the I_UAV to the A_UAV can be given as: where $g_0$ is the path loss constant, $r_0$ is the reference distance, $r$ is the distance between the UAVs, $\varphi$ is the path loss exponent and $\tau$ is the time.
Given the system's energy consumption, our goal is to find the optimal settings that minimize the expected cumulative energy across the time horizon. The decision variables in every time slot $t$ that affect the total system energy are given by the set $\pi(t) = \{p_i(t), P_{access}(t), S_{access}(t)\}$, corresponding to the transmission energies of the I_UAVs and the A_UAV, and the transition energy of the A_UAV, respectively. Moreover, the channel information for the data offloading task is not deterministic and varies in the environment; hence the amount of data arriving at the A_UAV is stochastic and depends on the channel characteristics and the position of the selected I_UAV. Further, this framework does not consider the energy consumed for the movement of I_UAVs, as the PoIs are predefined and the I_UAVs follow a predetermined trajectory consuming constant energy. The overall optimization model for stable system performance is formulated as: s.t.
$$P_{access}(t) \le P_{max}, \quad \forall t \qquad (14)$$
||S lim
Constraints (13) and (14) define the maximum transmission power of the I_UAVs and the A_UAV, respectively. Constraint (15) limits the maximum transition energy of the A_UAV for every transition and Constraint (16) limits the time that has elapsed since the last access of the $i$th I_UAV to be less than $R_{max}$. Additionally, (17)-(19) bound the number of transmitted bits. Constraints (20) and (21) establish the rate stability of all the system queues (I_UAVs and A_UAV). Next, the model to optimize the trajectory of the A_UAV with respect to the trajectories of the I_UAVs is discussed.

Distance and Latency Aware Trajectory (DLAT) Optimization
Flexible and dynamic trajectory planning of the A_UAV is crucial to applications where terrestrial communication infrastructure is missing. As already mentioned, the position of the I_UAVs changes in every time slot since they move through different PoIs to collect data. The A_UAV's trajectory needs to be planned so that it can connect to and access an I_UAV in a time slot before the I_UAV's queue overflows. Whenever an I_UAV's queue gets full, it does not move to its next designated PoI. Instead, it sojourns at the same PoI until it can offload its data to the A_UAV and free up queue space. In order to choose one of the I_UAVs to gather data from, the A_UAV would require real-time information about the queues of all I_UAVs in each time slot. This information is not available a priori due to the dynamic nature of the system queues. We use a message-passing-based approach for estimating the queues of the I_UAVs to make a selection. Further, the trajectory of the A_UAV must be optimized to consume minimal energy.
The trajectory optimization model of the A_UAV optimizes the trade-off between the transition energy of the A_UAV and the access latencies of all I_UAVs. In addition, this access latency based data offloading generates a fair access schedule for the I_UAVs to offload their data to the A_UAV. The access latency ($R_i(t)$) of the $i$th I_UAV in time slot $t$ is the difference between the time of its last access by the A_UAV and the current time slot. The distance and latency aware trajectory optimization of the A_UAV is formulated as:
$$x_i(t) \in \{0, 1\}, \quad \forall i, \forall t \qquad (28)$$
where the first constraint (23) signifies that the distance travelled within a time slot is limited by the maximum velocity. Constraint (24) limits the time that has elapsed since the last access of the $i$th I_UAV to be less than $R_{max}$. The constraint in (25) selects an I_UAV which has data to offload, whereas (26) enforces the selection of only one of the I_UAVs in a time slot. The transmission power of the selected I_UAV should be bounded as given in (27).
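To make the roles of the constraints concrete, the following Python sketch shows one greedy selection rule that respects them. It is only an illustration and not the optimization model itself: it assumes the A_UAV knows (or has estimated) each I_UAV's position, queue length and access latency, and it picks a single I_UAV with data to offload, prioritising any I_UAV that is about to reach $R_{max}$ and otherwise preferring the nearest reachable one. The function name `select_i_uav` and all parameter names are hypothetical.

```python
import math


def select_i_uav(a_pos, i_positions, queues, latencies, r_max, max_step):
    """Pick at most one I_UAV to access in the current slot (greedy heuristic).

    a_pos       -- (x, y) position of the A_UAV
    i_positions -- list of (x, y) positions of the I_UAVs (estimated)
    queues      -- list of estimated queue lengths Q_i(t)
    latencies   -- list of access latencies R_i(t), in slots
    r_max       -- maximum tolerated access latency R_max
    max_step    -- maximum distance the A_UAV can travel in one slot
    """
    # Only I_UAVs that actually hold data are candidates.
    candidates = [i for i, q in enumerate(queues) if q > 0]
    if not candidates:
        return None

    # Latency guard: skipping these I_UAVs once more would reach R_max.
    urgent = [i for i in candidates if latencies[i] + 1 >= r_max]
    if urgent:
        return max(urgent, key=lambda i: latencies[i])

    # Otherwise prefer the nearest reachable I_UAV; break ties by backlog.
    reachable = [
        i for i in candidates
        if math.dist(a_pos, i_positions[i]) <= max_step
    ]
    pool = reachable or candidates
    return min(pool, key=lambda i: (math.dist(a_pos, i_positions[i]), -queues[i]))


choice = select_i_uav(
    a_pos=(0.0, 0.0),
    i_positions=[(10.0, 0.0), (50.0, 0.0), (200.0, 0.0)],
    queues=[5.0, 9.0, 7.0],
    latencies=[0, 1, 1],
    r_max=4,
    max_step=100.0,
)
print(choice)
```

The latency guard is what prevents a purely distance-based rule from starving a far-away I_UAV.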

Estimating Position and Queue Length
The exact position and queue length of the I_UAVs are not known to the A_UAV a priori. The A_UAV maintains the last access statistics of each I_UAV using status messages. The record of status messages received over time helps in computing the position ($l_i$) and queue length ($Q_i(t)$) of the I_UAVs in a time slot. The status message comprises the remaining queue size at the time of access and the data to be collected at the current PoI. Moreover, the pre-computed trajectory of each I_UAV provides the set of PoIs to be visited by that I_UAV. Algorithm 1 describes the procedure to estimate the queue length of each I_UAV in every time slot.

Estimation of Search Time Bound
The A_UAV estimates the location of the I_UAVs in each time slot using the last access statistics. It may need to search a set of candidate locations to find the precise location of the selected I_UAV, which contributes to the search time. The bound on the search time depends on the data generation rate and the maximum buffer of the I_UAVs, as derived below.
Lemma 1. The search time $\tau_{search}$ to locate the exact location of an I_UAV with maximum buffer size $Q_{max}$ is given as: where the distance term is the maximum distance between two consecutive PoIs in the possible set of locations to be searched, $|\psi|$ is the number of candidate locations for the I_UAV, and the data generation process at each PoI follows the normal distribution $D \sim \mathcal{N}(\mu, \sigma^2)$.
Proof. The time taken to find the location of the I_UAV depends on the travel distance needed to cover the candidate PoI locations, as given in Equation (30).
In general,
$$|\psi_{min}| \ge |\psi_{max}| \qquad (31)$$
where $\psi_{min} = \{l_i, \ldots, l_{i,min}\}$ is the set of locations visited when each location has the minimum data $D_{min}$ to be collected, whereas $\psi_{max} = \{l_i, \ldots, l_{i,max}\}$ is the set of locations visited when the maximum data $D_{max}$ is present at each location. As the memory of each I_UAV is bounded by $Q_{max}$, it covers fewer locations for $\psi_{max}$, as shown in Equation (31). The data collected in both scenarios is the same, as the maximum memory size is fixed.
The candidate locations are defined as the locations starting at $l_{i,max}$ and ending at $l_{i,min}$. Intuitively, the number of candidate locations is $|\psi| = |\psi_{min}| - |\psi_{max}|$.
From the above derivation, the locations in the search trajectory are influenced by the data rate and the maximum memory size of the I_UAVs. The upper and lower limits of the normally distributed data are taken as $D_{max} = \mu + \sigma$ and $D_{min} = \mu - \sigma$, respectively. Thus Equation (30) can be written as: To calculate the upper bound for Equation (33), the distance between consecutive PoIs can be obtained from the pre-calculated shortest-path trajectory of the I_UAVs.
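The counting argument in the proof can be checked numerically with a small Python sketch. It rests on two assumptions that are not spelled out in the text: the number of PoIs visited before the buffer fills is taken as the integer part of $Q_{max}/D$, and the search time is taken as the candidate count multiplied by the largest gap between consecutive PoIs, divided by a constant travel speed. The function names and the parameters `max_gap` and `speed` are hypothetical.

```python
import math


def candidate_count(q_max, mu, sigma):
    """Number of candidate locations |psi| = |psi_min| - |psi_max|."""
    d_min, d_max = mu - sigma, mu + sigma
    if d_min <= 0:
        raise ValueError("mu - sigma must be positive for this estimate")
    # PoIs visited before the buffer fills at the lightest and heaviest load.
    n_psi_min = math.floor(q_max / d_min)
    n_psi_max = math.floor(q_max / d_max)
    return n_psi_min - n_psi_max


def search_time_bound(q_max, mu, sigma, max_gap, speed):
    """Assumed bound: travel distance over all candidates divided by speed."""
    return candidate_count(q_max, mu, sigma) * max_gap / speed


# Illustrative numbers only (not taken from the simulation table).
print(candidate_count(q_max=1000.0, mu=150.0, sigma=50.0))
print(search_time_bound(q_max=1000.0, mu=150.0, sigma=50.0, max_gap=20.0, speed=10.0))
```

A larger spread $\sigma$ widens the gap between $|\psi_{min}|$ and $|\psi_{max}|$, so the A_UAV has more candidate locations to search.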

Energy Aware Data Offloading
The model presented in P1 in Section 4 is a stochastic optimization problem as the arrival of data in the system queues is stochastic. Using the online Lyapunov optimization algorithm, we solve the stochastic optimization in P1 and jointly stabilize all queues by finding the optimal policy to access each I_UAV in each time slot. The quadratic Lyapunov function, as given in Equation (36), associates a scalar measure with the queues of the system. Further, the stability of the system is maintained by guaranteeing the mean rate stability of the evolving queues, as given in Equations (34) and (35).
where $v(t)$ consists of all system queues at time $t$ and $Z(\cdot)$ is the quadratic Lyapunov function of the system queues. The Lyapunov drift corresponding to the above function is given as: The Lyapunov drift-plus-penalty function, which is minimized to stabilize the queue backlog of the system, is given as: where $V$ is a positive constant that controls the trade-off between the Lyapunov drift and the expected system energy. A high value of $V$ places more weight on minimizing the energy of the system at the cost of a higher queue backlog. Therefore, $V$ acts as a trade-off parameter between system energy and queue backlog. An upper bound on $Z(v(t))$ can be derived as follows (for details see [10]): where $C$ is a deterministic constant. As a result, the upper bound of the drift-plus-penalty function becomes: Hence, the original formulation P1 can be reduced to P3, which bounds the system's drift to keep it stable.

P3 min
As can be observed, the constraints in P3 are a subset of the constraints in P1. To further simplify the solution of the optimization formulation, we reformulate P3 as two separate sub-problems, given that the positions of the A_UAV and the I_UAVs are fixed in a given time slot $t$. Lyapunov-based online optimization is well suited to deriving the overall optimal solution for such a stochastic system [33].
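The drift-plus-penalty idea used in this section can be sketched in a few lines of Python. The sketch assumes the common quadratic form $Z = \frac{1}{2}\sum_k q_k^2$ over all system queues and evaluates a single realisation rather than the conditional expectation that appears in the analysis; the function names are hypothetical.

```python
def lyapunov(queues):
    """Quadratic Lyapunov function over all queue backlogs (assumed 1/2 scaling)."""
    return 0.5 * sum(q * q for q in queues)


def drift_plus_penalty(queues_now, queues_next, energy, v):
    """One-slot sample of the drift plus V times the energy penalty."""
    drift = lyapunov(queues_next) - lyapunov(queues_now)
    return drift + v * energy


# Queues are listed as [Q_1, ..., Q_N, L]; values are illustrative.
now = [3.0, 4.0]
nxt = [1.0, 2.0]
print(lyapunov(now))
print(drift_plus_penalty(now, nxt, energy=2.0, v=5.0))
```

A larger `v` makes the energy term dominate, so a controller minimising this quantity spends less energy and tolerates longer queues, which matches the role of $V$ described above.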

Optimization of Transmission Energies of I_UAVs
The first sub-problem deals with the optimization of the parameters related to the I_UAVs.
The objective function in P3.1 is a convex function. The first and second constraints are linear and the third constraint is upper bounded by a concave function. As a result, the stationary point of the objective function is found to be:

Optimization of Transmission Energy of A_UAV
The second sub-problem deals with the optimization of the A_UAV parameters for the amount of data offloaded to the cloud at time $t$. The updated optimization model is given as: The model P3.2 has a convex optimization objective subject to convex constraints and is solved for the optimal transmission power of the A_UAV. The stationary point of the optimization model is
$$P_{access}(t) = \min\left\{\max\left\{\frac{L(t)\,W}{V} - \frac{N_0 W}{\zeta},\ 0\right\},\ P_{max}\right\}.$$
Thus, the derived stationary points of the optimization model using the Lyapunov optimization framework are calculated in every time step to optimize the A_UAV trajectory and data-offloading tasks. The overall proposed solution approach is presented in Algorithm 2. Next, we discuss the experimentation setup for evaluating the proposed solution.

Algorithm 2 Proposed Solution Approach
Initialize: Trajectories of all I_UAV$_i$ and list of PoIs $l_i$.
4. Select the $i$th I_UAV to collect data using P2
5. Compute $d^{off}_i(t)$ for the $i$th I_UAV using P3.1 to offload data to the A_UAV
6. Update $Q_i(t)$
7. t = t + 1
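The closed-form power level for the A_UAV stated in the second sub-problem can be evaluated per time slot as in the Python sketch below. It assumes the expression reads $P_{access}(t) = \min\{\max\{L(t)W/V - N_0 W/\zeta, 0\}, P_{max}\}$, i.e., the two terms are fractions with denominators $V$ and $\zeta$; the function and argument names are hypothetical.

```python
def a_uav_tx_power(backlog, bandwidth, v, noise_psd, gain, p_max):
    """Clamped stationary point for the A_UAV transmission power.

    backlog   -- A_UAV queue length L(t)
    bandwidth -- channel bandwidth W
    v         -- Lyapunov trade-off parameter V
    noise_psd -- noise power spectral density N_0
    gain      -- channel power gain zeta
    p_max     -- maximum transmission power P_max
    """
    unconstrained = backlog * bandwidth / v - noise_psd * bandwidth / gain
    # Project onto the feasible interval [0, P_max].
    return min(max(unconstrained, 0.0), p_max)


# Illustrative values only: an empty queue, a moderate queue and a long queue.
for backlog in (0.0, 2000.0, 1e6):
    print(backlog, a_uav_tx_power(backlog, 1.0, 1000.0, 0.5, 1.0, 5.0))
```

The power grows with the backlog $L(t)$ and shrinks as $V$ increases, so a long queue pushes the A_UAV to transmit harder until it saturates at $P_{max}$.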

Experimentation
In this section, we present the simulation setup used to validate the efficacy of our proposed Distance and Latency Aware Trajectory optimization with Lyapunov-based energy-aware data offloading, followed by results and discussion. The simulation parameters are listed in Table 2. We consider a 600 m × 600 m area with PoIs spread across the region in disjoint clusters and at heights ranging from 70 to 80 m above the ground. All experiments are repeated at least 30 times and the averaged results are plotted. We sample 150 PoI locations uniformly at random in three disjoint clusters. From a practical point of view of a multi-UAV system, we consider a system of three I_UAVs with one A_UAV in all the simulation experiments. Each I_UAV is assigned to a cluster, where it randomly chooses a starting location within the cluster. The sequence of PoIs to be visited by each I_UAV is generated using the shortest path algorithm. Before proceeding to the next PoI, an I_UAV collects all the data ($A_i(t)$) from the current PoI. In the data collection process, an I_UAV may sojourn at the same PoI across multiple time slots until all the data ($A_i(t)$) of the PoI is collected. For each PoI, the amount of data to be collected is modeled as a Gaussian distribution with a mean of 150 Kb and a variance of 50 Kb. The A_UAV gets only partial information about the data generated at each location, so it cannot accurately estimate the location of an I_UAV in the next time slot; as a result, it has to search the candidate locations to access the selected I_UAV, as discussed in Section 5.1. The trade-off parameter $V$ ranges from 10 to $10^{10}$. The length of each time slot ($\tau$) is 25 s, divided into different sub-slots as shown in Figure 2. The selection of an I_UAV is assumed to take negligible time, whereas the transition may take up to 20 s. The search and transmit functions take a total of 5 s. The maximum transmission powers for the A_UAV and the I_UAVs are 5 W and 2 W, respectively [25].
The other simulation parameters are listed in Table 2.
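The PoI layout described above can be generated with a short Python sketch using only the standard library. Several details are assumptions made for illustration: the three disjoint clusters are modelled as vertical strips separated by a margin, the 150 PoIs are split equally between clusters, the stated variance of 50 is converted to a standard deviation of $\sqrt{50}$, and negative data draws are clipped to zero. The function name `sample_pois` and its parameters are hypothetical.

```python
import random


def sample_pois(n_total=150, n_clusters=3, side=600.0, margin=20.0, seed=0):
    """Sample PoIs uniformly inside disjoint vertical strips of a square area."""
    rng = random.Random(seed)
    strip = side / n_clusters
    sigma = 50.0 ** 0.5  # standard deviation derived from the stated variance
    pois = []
    for c in range(n_clusters):
        # The margin keeps neighbouring clusters disjoint (layout is assumed).
        x_lo, x_hi = c * strip + margin, (c + 1) * strip - margin
        for _ in range(n_total // n_clusters):
            pos = (
                rng.uniform(x_lo, x_hi),   # x in metres
                rng.uniform(0.0, side),    # y in metres
                rng.uniform(70.0, 80.0),   # height in metres
            )
            data_kb = max(0.0, rng.gauss(150.0, sigma))
            pois.append({"cluster": c, "pos": pos, "data_kb": data_kb})
    return pois


pois = sample_pois()
print(len(pois), pois[0]["cluster"], round(pois[0]["data_kb"], 1))
```

Fixing the seed makes a run repeatable; the 30 repetitions mentioned in the setup could then use 30 different seeds.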
In order to validate the performance of our proposed approach, we compared it with a set of baseline approaches in two broad categories of optimization parameters, viz. trajectory planning and data offloading. We consider the following baseline approaches:
The proposed approaches are as follows:
• Distance and Latency Aware Trajectory Optimization (DLAT): In this approach, the A_UAV selects the I_UAV based on the minimum distance, with maximum bits to offload and the access latency constraint as given in the trajectory optimization problem.
• Hybrid Approach for Trajectory Scheduling (HDLAT): In this approach, the A_UAV selects the I_UAV based on the minimum distance and the access latency constraint as given in the trajectory optimization problem up to a certain threshold of battery, i.e., 75% of the total battery. Beyond the threshold, the scheduling algorithm switches to the DAT strategy (proposed approach).
• Lyapunov Optimization for data offloading: In each time slot, the A_UAV and the I_UAV calculate the optimal value of transmission energy using the Lyapunov optimization.

Results and Discussion
In this section, we compare the performance of our proposed approaches with the baseline approaches. Figure 4a shows that the average buffer length of the I_UAVs increases with V beyond the inflection point. A value of log(V) between 6 and 7 can maintain the queue buffer while consuming less transmission energy for the I_UAVs. Similarly, for higher values of V the A_UAV receives less data from the I_UAVs, which in turn reduces the total data collected or transmitted by the A_UAV, as shown in Figure 3c.
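The qualitative effect of V on backlog and transmission energy can be reproduced with a toy drift-plus-penalty loop for a single I_UAV queue. This is a minimal sketch under stated assumptions: the logarithmic rate model, `BANDWIDTH_KBPS`, `GAIN`, the constant arrival per slot, and the discretised power levels are hypothetical and are not the channel or traffic model of the paper; only the 2 W power cap and the 5 s transmit window are taken from the setup.

```python
import math

P_MAX_W = 2.0            # maximum I_UAV transmission power (from the setup)
TX_TIME_S = 5.0          # search-and-transmit sub-slot (from the setup)
ARRIVAL_KB = 150.0       # constant arrival per slot (assumption; mean PoI data)
BANDWIDTH_KBPS = 20.0    # hypothetical link bandwidth
GAIN = 10.0              # hypothetical normalised channel gain


def kb_sent(power_w):
    """Data that can be sent in one slot at the given power (assumed log rate model)."""
    return TX_TIME_S * BANDWIDTH_KBPS * math.log2(1.0 + GAIN * power_w)


def choose_power(queue_kb, v, levels=21):
    """Drift-plus-penalty step: minimise V * energy - Q(t) * service over a power grid."""
    candidates = [P_MAX_W * k / (levels - 1) for k in range(levels)]
    return min(candidates, key=lambda p: v * p * TX_TIME_S - queue_kb * kb_sent(p))


def simulate(v, slots=2000):
    """Return (average backlog in Kb, average transmission energy in J per slot)."""
    queue = total_q = total_e = 0.0
    for _ in range(slots):
        p = choose_power(queue, v)
        served = min(queue, kb_sent(p))
        queue = queue - served + ARRIVAL_KB   # queue update Q(t+1)
        total_q += queue
        total_e += p * TX_TIME_S
    return total_q / slots, total_e / slots


q_low, e_low = simulate(1e1)
q_mid, e_mid = simulate(1e4)
q_high, e_high = simulate(1e5)
```

In this toy model the backlog stays flat for small V and grows once V is large enough that saving energy outweighs draining the queue, while the average transmission energy falls as V increases. The numerical position of that turning point depends entirely on the assumed constants and should not be compared with the values in the figures.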

Hovering Energy of A_UAV
Figure 3d depicts the impact of the trade-off parameter V on the hovering energy of the A_UAV. The total hovering energy starts increasing after the inflection point because the A_UAV takes more time slots to collect the same amount of data from the I_UAVs. As a result, the energy consumption of the A_UAV increases significantly for DAT and HDLAT, whereas it remains constant for DLAT (whose total flying time is shorter). Similarly, Figure 3e shows the evolution of the total transition energy with V. It is interesting to observe that with the DAT baseline the A_UAV stays in the field for a longer time. As a result, its transition energy is higher than that of DLAT and HDLAT. However, the A_UAV transition energy starts decreasing after the inflection point. Similarly, DAT and HDLAT consume less battery in every time slot, as both save on transition energy by selecting the nearest I_UAV. This allows the A_UAV to stay longer in the field, as illustrated in Figure 3f. The DAT baseline remains in the field for the longest time, whereas RR stays for the fewest time slots before running out of battery. The proposed DLAT and HDLAT approaches lie between the extreme baselines in the different analyses conducted. This shows that the proposed approach achieves an optimised trade-off between energy saving and end-to-end data offloading from multiple I_UAVs.
Figure 3b shows the effect of the trade-off parameter V on the average access latency of the system. The trajectory of the A_UAV affects the access sequence and the waiting times of the I_UAVs to offload their data to the A_UAV. This can be observed for the proposed DLAT + Lyapunov approach, which has an upper bound on access latency throughout the system. Similarly, the RR-based baseline has an access latency of 2 time slots, whereas for both the DAT and DAT + MAX baselines the average access latency is higher. The average access latency of the DAT baseline becomes worse with increasing V.
The average access latency remains stable for DLAT in both scenarios. HDLAT, as expected, lies between the DLAT and DAT approaches. This is because an increase in V causes the A_UAV to spend more time slots collecting the data from the I_UAVs. The longer flying time of the A_UAV results from the reduced transition and transmission power, which increases the average access latency of the DAT and HDLAT approaches. In contrast, the latency remains constant for DLAT and RR because of the latency constraints. Our proposed approaches lie between the extreme baselines and maintain the average access latency; HDLAT additionally saves transmission and transition energy by switching from DLAT to DAT after 75 percent of the battery is consumed. The average energy consumption of the A_UAV includes transmission, transition, and hovering energy. A trade-off between the average access latency and the average energy consumption can also be observed: as the average access latency of the system decreases, the average energy consumption increases. By definition, and from Figure 3a,b,d,e, it can be observed that the RR baseline has the lowest average access latency and the highest energy consumption, whereas DAT has the highest average access latency and the lowest energy consumption. Figure 3b shows that for HDLAT the average access latency is reduced by approximately 70% compared to the greedy approach (DAT), while it remains constant for DLAT. The RR baseline has the lowest average access latency, but the gap between HDLAT and RR is much smaller than the gap between HDLAT and DAT.
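To make the access-latency comparison concrete, the following Python sketch computes an average access latency from an access schedule. The definition used here (the number of slots an I_UAV waits between two consecutive accesses, averaged per I_UAV and then across I_UAVs) is an assumption chosen for illustration, and the `greedy_like` schedule is hypothetical rather than taken from the simulations.

```python
def average_access_latency(schedule):
    """Mean waiting time (in slots) between consecutive accesses, averaged over I_UAVs."""
    last_seen, waits = {}, {}
    for slot, uav in enumerate(schedule):
        if uav in last_seen:
            # Slots strictly between two consecutive accesses of the same I_UAV.
            waits.setdefault(uav, []).append(slot - last_seen[uav] - 1)
        last_seen[uav] = slot
    per_uav = [sum(w) / len(w) for w in waits.values()]
    return sum(per_uav) / len(per_uav) if per_uav else 0.0


# Round-robin over three I_UAVs: each one is revisited every third slot.
round_robin = [0, 1, 2] * 10
# Hypothetical greedy schedule that keeps returning to the nearest I_UAV (id 0).
greedy_like = [0, 0, 0, 0, 1, 0, 0, 0, 0, 2] * 3

rr_latency = average_access_latency(round_robin)
greedy_latency = average_access_latency(greedy_like)
```

Under this definition the round-robin schedule yields a latency of 2 slots for three I_UAVs, consistent with the value quoted for the RR baseline, while a schedule that favours one I_UAV leaves the others waiting considerably longer.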

Coverage of PoIs
The coverage of PoIs by the A_UAV is defined as the number of PoI locations whose data has been offloaded to the A_UAV by the I_UAVs. Figure 4b shows the effect of the trade-off parameter V on the number of PoIs covered in the system. It can be observed that the A_UAV serves more PoIs with both the DLAT and HDLAT approaches than with the RR baseline. The DAT baseline serves relatively more PoIs than DLAT and HDLAT by saving on transition and transmission energy, but it does not maintain low access latencies for the I_UAVs. Figure 4c shows the effect of increasing the buffer size of the I_UAVs on the number of PoIs served: the number of PoIs served increases with the buffer size. The performance of the proposed DLAT and HDLAT approaches is a trade-off between the two extreme baselines. The DAT baseline covers more locations, but at the cost of access latency, as shown in Figure 4d. As with the average access latency, the trade-off between PoI coverage and average energy consumption is evident from Figures 3e and 4b. The approach with higher average energy consumption also has lower PoI coverage. RR has the highest energy consumption and covers the fewest PoIs, whereas DAT has the lowest energy consumption and covers the most PoIs. In the optimal processing zone of log(V) between 6 and 7, HDLAT covers more than double the number of PoIs covered by RR while consuming less energy.

Conclusions
In this paper, we have introduced an online scheduling approach to collect data in resource-constrained and infrastructure-deficient environments. The paper presents a heterogeneous multi-UAV framework for end-to-end data collection and offloading using distance and latency-aware trajectory optimization. The Lyapunov optimization approach is used to ensure the stability of the system in terms of the expected queue backlogs by breaking the system optimization problem into manageable sub-problems. The simulation results show that our proposed approaches (DLAT and HDLAT) achieve better access latency than the other baseline approaches. Moreover, the analysis of the system parameter V has shown a trade-off between queue stability and system utility. The paper also presents a detailed analysis of the different forms of energy consumption of the A_UAV. In the future, we plan to optimize the trajectories of the I_UAVs jointly with that of the A_UAV. The energy optimization of the I_UAVs could be analyzed while designing the optimal trajectories of both the I_UAVs and the A_UAVs. Further, the effect of introducing more than one A_UAV on the overall optimization approach could be studied. In addition, multi-agent reinforcement learning could be explored to solve the problem by introducing coordination among the A_UAVs and I_UAVs.