Joint Radar-Communication Optimization of Distributed Airborne Radar for AOA Localization

Abstract: Compared to the distributed ground-based radar (DGBR), the distributed airborne radar (DAR) has been widely applied due to its stronger anti-damage ability, more degrees of freedom, and better detection view of targets. However, unlike DGBR, the premise for the normal operation of DAR is to maintain stable wireless communication between unmanned aerial vehicles (UAVs). This requires each UAV to make reasonable use of its electromagnetic domain resources, that is, to maximize radar detection performance while satisfying communication performance constraints. However, current research in the field of radar resource allocation has not taken this into account, which greatly limits the practical application of optimization algorithms. Moreover, current research tends to adopt centralized optimization algorithms. When the baseline of the UAV swarm is long, applying multi-relay methods directly results in heavy communication overhead and long time delays. Based on the above background, this article aims to develop a fully distributed algorithm for the joint optimization of radar detection performance and communication transmission performance. This study first takes angle of arrival (AOA) measurement as an example to provide a system model with communication constraints. This model considers the impact of factors such as the UAV location error, UAV communication coverage, and the dynamic communication topology of the UAVs on joint optimization. A formal representation of the joint optimization is presented. Then, a joint radar-communication optimization (JRCO) algorithm is proposed to fully utilize the electromagnetic domain resources of each UAV. Finally, numerical simulations verify the effectiveness of the proposed JRCO algorithm compared with traditional radar resource allocation methods.


Introduction
In recent years, radars have faced increasingly complex detection environments with the continuous development of target stealth and electronic warfare technology [1][2][3]. The development of target stealth technology has continuously reduced the radar cross section (RCS) of targets, making it difficult for traditional radar to detect them [4,5]. Meanwhile, the development of electronic warfare technology has led to a rapid increase in the number of suppression and deception jamming devices [6,7]. This has produced an increasingly crowded spectrum environment for traditional radars, which must choose appropriate frequency bands to carry out detection tasks smoothly. In this context, the detection ability of a traditional single ground-based radar (GBR) decreases further for slow, low-altitude targets [8]. To address these challenges, DGBR is developing rapidly. Compared to a single radar, DGBR can not only detect targets from multiple views but also effectively integrate radars of different frequency bands, polarization modes, and signal bandwidths, achieving stronger detection performance than a simple combination of single radars [9][10][11]. However, because a large GBR is fixed on the ground, its detection view of low-altitude targets is limited. Moreover, because the location of a GBR is fixed (i.e., optimal deployment cannot be achieved), target detection performance is usually limited. To overcome these shortcomings, traditional DGBR usually enhances detection capability by increasing transmission power [12], antenna aperture [13], and signal bandwidth [14]. However, these methods often lead to a rapid increase in the volume of radar equipment, further degrading radar maneuverability. They can also cause serious problems, such as a rapid increase in the overall manufacturing cost of the radar system and difficulties in later maintenance [15].
DAR is a revolutionary technology relative to traditional DGBR [16][17][18] and usually has the following advantages: 1. DAR has more degrees of freedom. DAR can deploy radar flexibly by adjusting the locations of the UAVs [19]. This not only enables more effective detection of targets from multiple angles but also effectively avoids external jamming and attacks [20]. For example, when some nodes in the network are paralyzed by external jamming or attack, other UAVs in the network can automatically continue to perform tasks, which greatly improves the robustness of radar networking. 2. DAR has a better detection view of targets. Compared to large DGBRs, DAR typically performs tasks at higher altitudes. High platforms overcome the limited line-of-sight (LOS) of DGBR, making air detection less susceptible to interference from ground clutter and increasing the probability of target detection [21]. 3. DAR has stronger scalability. Large DGBR can usually only scale the overall system by enhancing the performance (power, antenna aperture, etc.) of each radar in the network, which is easily constrained by the hardware conditions of the radar itself. DAR can flexibly increase or reduce the number of UAVs according to the task requirements [22], making scalability easier to achieve.
Since DAR has the above advantages, it has been widely applied in target detection [23], localization [24], tracking [25], recognition [26], imaging [27], and other tasks. The premise for DAR to leverage these advantages is that the radar system can reasonably allocate its internal resources based on the task requirements. In general, the problem of target localization (or tracking) can be analytically characterized by deriving the Cramer-Rao lower bound (CRLB) for target location estimation, which facilitates the construction of an objective function for optimization. Most current research on distributed radar resource allocation focuses on the joint optimization of multiple parameters within a radar system in the context of localization (or tracking). For example, Zheng Nan'e et al. proposed a joint resource allocation scheme involving the sensor subset, power, and bandwidth. With the Bayesian CRLB as the objective function, the power and bandwidth allocation were both optimized using the sequential parametric convex approximation method [28]. Shi Chenguang et al. jointly optimized the flight velocity, heading angle, radiation power, signal bandwidth, etc. Their method was compared with the following four algorithms: Fixed Path Planning and Optimal Transmit Resource Scheduling, Cooperative Online Path Planning and Transmit Parameter Optimization, Cooperative Online Path Planning and Waveform Parameter Selection, and Online Path Planning and Fixed Transmit Resource Scheduling. The simulation results showed that their algorithm could effectively improve target tracking accuracy while meeting the radio frequency (RF) stealth performance requirements of various airborne radars [29]. Weiwei Zhang et al. jointly optimized the power allocation, bandwidth allocation, and radar node selection in the context of distributed radar localization [30]. Haowei Zhang et al. proposed a joint subarray selection and power allocation (JSSPA) policy for large-scale distributed MIMO radar networks when tracking multiple targets in cluttered environments. Based on the information of the recursive tracking method, optimal resource allocation could be achieved, aiming to improve tracking accuracy [31].
From the above research status, it can be concluded that the current research on distributed radar resource allocation is usually characterized by the following features: 1. The optimization objective is usually the CRLB of the estimated target location. 2. The main optimization variables are electromagnetic parameters and geographic parameters in the radar domain. The electromagnetic parameters here refer to parameters such as the radar power, waveform, bandwidth, etc., while geographic parameters mainly refer to the location of UAVs. However, these methods for distributed radar resource allocation may incur significant problems when directly applied to airborne radar platforms. The biggest difference between DAR and DGBR is that DAR achieves information transmission and instruction distribution through wireless communication, and stable wireless communication is the prerequisite and foundation for airborne radar to perform tasks. In this context, UAVs need to maximize radar detection performance while meeting system communication performance constraints. As radar and communication resources both belong to the electromagnetic domain, the joint planning of radar and communication systems is a significant research field. The current research rarely involves this field, which means that the effective planning and utilization of electromagnetic domain resources in airborne radar systems have not been achieved. This may result in the loss of radar detection performance or the failure of the UAV swarm to fully meet communication performance constraints during the whole process of mission execution.
Currently, very few research studies, represented by Sheng Xu et al. from the University of South Australia, have considered communication constraints in DAR localization systems. For example, they modeled the communication range as an area surrounded by a circle that is centered around the UAV itself [31][32][33]. When the UAV communicates with other UAVs, they fuse the target location information received by the UAV. Nevertheless, these studies only consider the impact of communication performance on information fusion without the collaborative optimization of joint radar-communication electromagnetic domain resources.
Given the above research background, this article focuses on the JRCO algorithm in the context of the passive localization of targets by DAR. A fully distributed JRCO algorithm is proposed to achieve optimal localization performance under communication performance constraints. Compared with traditional methods, the proposed method can significantly improve the target localization performance of the UAV swarm.
The main contributions of this article are as follows:

1. A target localization system model for DAR is proposed. Compared with traditional DAR system models, the proposed model considers the impact of UAV communication coverage on radar detection performance, the impact of UAV communication coverage on the topology of the UAV communication system, and the impact of the UAV location error on target localization performance.

2. A fully distributed JRCO algorithm is proposed. The main idea of this distributed algorithm is to let each UAV sense the information broadcast by neighboring UAVs in each time slot and make decisions simultaneously, enabling the joint optimization of radar and communication system parameters and fully utilizing the degrees of freedom of the electromagnetic domain. Meanwhile, compared to centralized algorithms, distributed algorithms can continue to perform tasks when a few nodes fail, avoiding the extreme situation of system paralysis caused by the failure of a few key nodes.

3. Numerical simulations demonstrate that, compared with the current research of Sheng Xu et al., the proposed algorithm achieves better target localization performance under communication constraints.
The article is organized as follows: Section 2 introduces the DAR system model under a passive localization background. Section 3 proposes a fully distributed JRCO algorithm and explains the details of each module. Section 4 verifies the effectiveness of the proposed algorithm compared to traditional radar domain resource allocation methods through numerical simulations. Section 5 summarizes the entire work and provides prospects for future research.

System Model
This section first provides a visual description of the scenario, and the specific mathematical modeling is explained in the subsections.
As shown in Figure 1, the passive localization model of an active target by UAV swarms was considered. Each UAV in the swarm performed AOA measurements on the target in each time slot. Each UAV carried radar and communication equipment, where the radar was applied to measure the AOA of the target. The communication equipment was applied to broadcast the information obtained to its neighboring UAVs (specific information will be explained in Section 3.3). In each time slot, each UAV needed to plan the power of the radar system, the power of the communication system, and motion based on the information it sensed (i.e., the information broadcasted by the communication system). The ultimate goal of JRCO was to maximize the accuracy of target localization. In the initial state, all the UAVs departed from the departure area. When the iteration of the JRCO algorithm terminated, all UAVs returned to the departure area and fused all obtained AOA information in the base station. In this context, this article designed a fully distributed JRCO method to maximize the accuracy of target location estimation (Namely the CRLB, which will be derived in Section 2.6). "Distributed" refers to each UAV making independent decisions based on its sensed information at each time slot without being commanded by any other UAV. The reason for considering distributed algorithms instead of centralized algorithms is that when there are a large number of UAVs, centralized algorithms rely on multiple relays for command distribution, which can cause significant communication delays and load.

The Model of UAV Location
As shown in Figure 1, there were a total of M UAVs in the system, and the algorithm executed T time slots in total. The meaning of T is the maximum time allowed for UAVs to stay in the air. This indicator was set because the UAV swarm executed tasks far away from the base station. The UAV swarm then returns to the base station to unload data only after the task is complete. Therefore, restrictions must be implemented for T to prevent the energy depletion of UAVs.
For the safety of the UAVs, the motion of the UAV swarm is restricted to a rectangular area with four vertices: $[x_{\min}, y_{\min}, z_{UAV}]$, $[x_{\min}, y_{\max}, z_{UAV}]$, $[x_{\max}, y_{\min}, z_{UAV}]$, and $[x_{\max}, y_{\max}, z_{UAV}]$. This area is defined as the feasible region, as shown in Figure 1. The location coordinate of the i-th UAV in the t-th time slot is defined as $r_i^t = [x_i^t, y_i^t, z_i^t]^T$, and the location state vector of all UAVs is:

$$r^t = \left[(r_1^t)^T, (r_2^t)^T, \cdots, (r_M^t)^T\right]^T.$$

The location of the target is defined as $u = [x_u, y_u, z_u]^T$. Considering that the location of the UAVs might be affected by force majeure, such as wind, the vector composed of the locations of all UAVs in the t-th time slot is modeled as the sum of the mean vector $r^t$ and the location error $n_R^t$, where

$$n_R^t = \left[\Delta x_1^t, \Delta y_1^t, \Delta z_1^t, \cdots, \Delta x_M^t, \Delta y_M^t, \Delta z_M^t\right]^T.$$

All $\Delta x_i^t$, $\Delta y_i^t$, and $\Delta z_i^t$ are independent and identically distributed (i.i.d.) Gaussian random variables with zero mean and variance $\sigma_R^2$. The covariance matrix of $n_R^t$ is therefore:

$$C_R^t = \sigma_R^2 I_{3M}.$$
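The location error model above can be illustrated with a minimal Python sketch. The function name `perturb_locations`, the three example UAV positions, and the value of `sigma_r` are assumptions chosen for illustration only; they do not come from the article.

```python
import random


def perturb_locations(mean_locations, sigma_r, rng=None):
    """Add i.i.d. zero-mean Gaussian errors (std sigma_r) to every coordinate."""
    rng = rng or random.Random()
    return [
        [coord + rng.gauss(0.0, sigma_r) for coord in location]
        for location in mean_locations
    ]


# Hypothetical example: three UAVs flying at a common altitude of 100.
mean_locations = [[0.0, 0.0, 100.0], [50.0, 0.0, 100.0], [0.0, 50.0, 100.0]]
noisy_locations = perturb_locations(mean_locations, sigma_r=1.0, rng=random.Random(0))
```

Because every coordinate receives an independent error with the same variance, the covariance of the stacked error vector is a scaled identity matrix, which is why no cross terms appear in the sketch.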

The Model of UAV Communication Coverage
To facilitate information fusion between each UAV and its neighboring UAVs, each UAV needs to broadcast its own sensed information. The specific types of information sensed are introduced in Section 3.3. Due to the limited payload of each UAV, its communication coverage is usually limited. The effective communication coverage of each UAV is the area enclosed by a circle of a certain radius centered on the UAV itself; this radius is defined as the communication coverage radius. The communication coverage radius of the i-th UAV in the t-th time slot is defined as $\omega_i^t$, and the state vector of the communication coverage radii of all UAVs in the t-th time slot is defined as $\omega^t$:

$$\omega^t = \left[\omega_1^t, \omega_2^t, \cdots, \omega_M^t\right]^T.$$

Obviously, the larger the communication coverage radius, the greater the communication power required by the UAV. For the convenience of subsequent representation, all values of $\omega_i^t$ are taken from a discrete set of $N$ candidate values, which form a vector.

The Definition of UAV Dynamic Communication Topology
The location state vector and the state vector of the communication coverage radius determined the communication connectivity of the whole UAV swarm. The communication connectivity of the UAV swarm was similar to that in graph theory. Taking a directed graph as an example, when j-th UAV could sense i-th UAV, it was believed that there was a directed path from the i-th UAV to the j-th UAV. When the communication topology of the UAV swarm was non-connected topology, strongly connected topology, and completely-connected topology, respectively, they corresponded to the concepts of the non-connected graph, strongly connected graph, and completely-connected graph in graph theory, respectively. For more basic concepts of graph theory, please refer to [34].
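As an illustration of how the location state vector and the coverage radii determine the topology, the following sketch builds the directed sensing graph and tests the two connectivity notions mentioned above. It assumes that the j-th UAV can sense the i-th UAV when it lies inside the coverage circle of the i-th UAV; this rule and all function names are illustrative assumptions rather than definitions taken from the article.

```python
import math


def sensing_graph(locations, radii):
    """edges[i] lists each UAV j that can sense UAV i (j lies inside i's coverage)."""
    n = len(locations)
    return [
        [j for j in range(n)
         if j != i and math.dist(locations[i], locations[j]) <= radii[i]]
        for i in range(n)
    ]


def reachable(edges, start):
    """Set of nodes reachable from start by following directed edges."""
    seen, stack = {start}, [start]
    while stack:
        for nxt in edges[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def is_strongly_connected(edges):
    """True when every UAV can reach every other UAV along directed paths."""
    n = len(edges)
    if n == 0:
        return True
    reverse = [[i for i in range(n) if j in edges[i]] for j in range(n)]
    return len(reachable(edges, 0)) == n and len(reachable(reverse, 0)) == n


def is_completely_connected(edges):
    """True when every UAV has a direct edge to every other UAV."""
    return all(len(out) == len(edges) - 1 for out in edges)


# Hypothetical swarm: three UAVs on a line, 10 units apart, at the same altitude.
locations = [(0.0, 0.0, 100.0), (10.0, 0.0, 100.0), (20.0, 0.0, 100.0)]
edges = sensing_graph(locations, radii=[12.0, 12.0, 12.0])
```

With a radius of 12, each UAV only reaches its immediate neighbor, so the swarm is strongly connected through the middle UAV without being completely connected.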

The Model of Radar Power and Communication Power
The radar detection power of the i-th UAV in the t-th time slot is defined as $p_i^t$, and the communication transmission power is defined as $q_i^t$. In each time slot, the sum of the radar detection power and the communication transmission power of each UAV is a fixed value, i.e., $p_i^t + q_i^t = E_{\max}$, where $E_{\max}$ is defined as the total electromagnetic domain power.

The Model of AOA Measurement
For the i-th UAV, when it sensed the information broadcasted by the neighboring M i UAVs (specific information will be introduced in Section 3.3, M i UAVs including the i-th UAV itself), the i-th UAV could fuse the information broadcasted by M i UAVs and obtain an estimation of the target location. It should be noted that the focus of this article was not on target localization algorithms. However, Section 3 used the CRLB of the estimated target location as the objective function to achieve the joint optimization of each UAV location and electromagnetic domain power. The calculation of CRLB required the target location to be known; therefore, it was necessary to estimate the target location before executing the joint optimization algorithm. The AOA measurement based on multiple UAVs was the foundation of target location estimation; therefore, it is necessary to model the measurement process of the AOA first. The specific target location estimation method is provided in Section 3.4. We first provided a measurement model for azimuth AOA.
The azimuth AOA measurement model for the i-th UAV in the t-th time slot is:

$$\hat{\theta}_i^t = \theta_i^t + \tilde{n}_{\theta,i}^t, \tag{8}$$

where $\hat{\theta}_i^t$ is the azimuth AOA measurement vector of the neighboring UAVs sensed by the i-th UAV in the t-th time slot, $\theta_i^t$ is the mean vector of the azimuth AOA measured by the UAVs around the i-th UAV in the t-th time slot (the i-th UAV cannot sense this value), and $\tilde{n}_{\theta,i}^t$ is the measurement error carried by the i-th UAV in the t-th time slot. The elements of $\hat{\theta}_i^t$ and $\theta_i^t$ can be represented as:

$$\hat{\theta}_{i,k}^t = \theta_{i,k}^t + n_{\theta,i,k}^t, \qquad \theta_{i,k}^t = \arctan\frac{y_u - y_{i,k}^t}{x_u - x_{i,k}^t},$$

where $x_{i,k}^t$ and $y_{i,k}^t$ represent the x and y coordinates of the surrounding k-th UAV sensed by the i-th UAV in the t-th time slot, respectively.
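Each element in $\tilde{n}_{\theta,i}^t$ is an independent zero-mean Gaussian random variable, and $\tilde{n}_{\theta,i}^t$ can be represented as:

$$\tilde{n}_{\theta,i}^t = \left[n_{\theta,i,1}^t, n_{\theta,i,2}^t, \cdots, n_{\theta,i,M_i}^t\right]^T, \tag{12}$$

where $n_{\theta,i,k}^t$ represents the AOA measurement error of the neighboring k-th UAV sensed by the i-th UAV in the t-th time slot.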
The variance of $n_{\theta,i,k}^t$ represents the azimuth AOA measurement accuracy of the UAV system. The covariance matrix of $\tilde{n}_{\theta,i}^t$ is denoted $\tilde{C}_{\theta,i}^t$, a diagonal matrix with the dimension $M_i \times M_i$. Equations (12) and (13) describe the azimuth AOA measurement error of the neighboring UAVs from the perspective of the i-th UAV. It is also necessary to describe the azimuth AOA measurement error of all UAVs. From the global perspective of the UAV system, the azimuth AOA measurement error of the i-th UAV in the t-th time slot is defined as $n_{\theta,i}^t$, and $n_\theta^t$ represents the state vector composed of the azimuth AOA measurement errors of all UAVs in the t-th time slot:

$$n_\theta^t = \left[n_{\theta,1}^t, n_{\theta,2}^t, \cdots, n_{\theta,M}^t\right]^T.$$

The covariance matrix of $n_\theta^t$ is denoted $C_\theta^t$. The measurement model of the elevation AOA is analogous to the above, so its definitions are not repeated here. Meanwhile, to simplify the problem appropriately and reasonably, it is assumed that $\sigma_{\theta,i}^t = \sigma_{\phi,i}^t$ holds for $i = 1, 2, \cdots, M$; that is, each UAV has the same measurement error for the elevation and azimuth AOA in a given time slot.
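The measurement model can be made concrete with a short simulation sketch. The azimuth follows the arctangent geometry used in this section; the elevation formula, measured from the horizontal plane, is an assumption, since the article does not spell out the elevation model. The function name and the example coordinates are hypothetical.

```python
import math
import random


def aoa_measurement(uav, target, sigma, rng):
    """Return one noisy (azimuth, elevation) pair measured by a UAV, in radians."""
    dx, dy, dz = (t - s for t, s in zip(target, uav))
    azimuth = math.atan2(dy, dx)                      # true azimuth AOA
    elevation = math.atan2(dz, math.hypot(dx, dy))    # assumed elevation definition
    # The same standard deviation is used for both angles, matching the
    # simplification that azimuth and elevation errors are equal.
    return azimuth + rng.gauss(0.0, sigma), elevation + rng.gauss(0.0, sigma)


# Hypothetical example: UAV at the origin, target at (50, 50, 50), 0.01 rad noise.
measured = aoa_measurement((0.0, 0.0, 0.0), (50.0, 50.0, 50.0), 0.01, random.Random(0))
```

Using `atan2` rather than a plain arctangent keeps the azimuth in the correct quadrant when the target lies behind the UAV along the x axis.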

Derivation of CRLB for Target Position Estimation under AOA Model
The ultimate goal of the JRCO algorithm was to achieve a more accurate estimation of the target location by fully utilizing the degrees of freedom in radar-communication systems. Under this condition, it was necessary to theoretically derive the CRLB for target position estimation under the AOA mode. This CRLB would be the objective function for JRCO.
The measurement model of the radar system is:

$$x = v(u) + n, \tag{16}$$

where $x$, $v(\cdot)$, $u$, and $n$ represent the measurement vector, measurement function, target state vector, and noise vector, respectively. $x$ can be either an echo signal or a measured value of a parameter, such as the bistatic range. $u$ is the true value of the target location, which can be represented by $u = [x_u, y_u, z_u]^T$. The elements in Equation (16) are: The covariance matrix corresponding to $n$ is $C$.
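The elements in the Fisher information matrix $J(u)$ of $u$ can be calculated based on the following equation [35]:

$$[J(u)]_{ij} = \left(\frac{\partial v(u)}{\partial u_i}\right)^T C^{-1} \frac{\partial v(u)}{\partial u_j} + \frac{1}{2}\,\mathrm{tr}\!\left(C^{-1}\frac{\partial C}{\partial u_i}C^{-1}\frac{\partial C}{\partial u_j}\right),$$

where $\mathrm{tr}(\cdot)$ represents the trace of the matrix and $u_i$ represents the i-th variable in $u$.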
The CRLB matrix $C_{CRLB}$ is:

$$C_{CRLB} = J^{-1}(u). \tag{21}$$

This article uses the trace of $C_{CRLB}$ as a quantitative standard for the performance of target localization. Here, $J(u)$ is a $3 \times 3$ matrix. Due to the complexity of deriving the elements in $J(u)$ and finding the inverse of $J(u)$, the numerical value of the trace is calculated directly as the optimization objective function in subsequent algorithms. The process of optimizing the objective function to obtain a gradient can be achieved using mature convex approximation methods such as successive convex approximation (SCA), which is not repeated here. and $\omega_i^t$. The relationship between $a_r$ and $a_c$ of the i-th UAV in the t-th time slot is given in Equation (22), where $a_r$ and $a_c$ represent parameters relating to the power consumption of radar systems and communication systems, respectively. The magnitude of these two values can be fitted in practical applications by measuring the power consumption of hardware systems.
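The numerical evaluation of the CRLB trace described in Section 2.6 can be sketched as follows. This is a minimal illustration under stated assumptions: every UAV measures azimuth and elevation with the same standard deviation `sigma`, the noise covariance does not depend on the target location (so the trace term of the Fisher information vanishes), and elevation is measured from the horizontal plane. The function names and the example geometry are hypothetical.

```python
import numpy as np


def aoa_jacobian(uavs, target):
    """Stack d(azimuth_k)/du rows for all UAVs, followed by d(elevation_k)/du rows."""
    d = np.asarray(target, float) - np.asarray(uavs, float)   # shape (K, 3)
    dx, dy, dz = d[:, 0], d[:, 1], d[:, 2]
    rho2 = dx**2 + dy**2          # squared horizontal range
    rho = np.sqrt(rho2)
    r2 = rho2 + dz**2             # squared slant range
    azimuth_rows = np.column_stack([-dy / rho2, dx / rho2, np.zeros_like(dx)])
    elevation_rows = np.column_stack(
        [-dx * dz / (rho * r2), -dy * dz / (rho * r2), rho / r2]
    )
    return np.vstack([azimuth_rows, elevation_rows])


def crlb_trace(uavs, target, sigma):
    """Trace of the inverse Fisher information matrix for the target location."""
    h = aoa_jacobian(uavs, target)
    fim = h.T @ h / sigma**2      # J(u) = H^T C^{-1} H with C = sigma^2 I
    return float(np.trace(np.linalg.inv(fim)))


# Hypothetical geometry: three UAVs at altitude 100 observing a ground target.
uavs = [[0.0, 0.0, 100.0], [100.0, 0.0, 100.0], [0.0, 100.0, 100.0]]
target = [50.0, 40.0, 0.0]
objective = crlb_trace(uavs, target, sigma=0.01)
```

In the algorithm the true target location is unknown, so an estimate would take the place of `target` when this quantity is used as an objective.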

Formal Representation of JRCO
The state of the UAV swarm in time slot t is represented by the vector $S^t$ defined in Equation (23), which comprises $r^t$, $\omega^t$, and $C_R^t$. The formal representation of JRCO at the t-th time slot is given in Equation (24), where $J(\cdot)$ is the optimization objective function, which is the CRLB of the target location estimation for all UAVs in the swarm. This function can be calculated based on the trace of $C_{CRLB}$. $d(r_i^t, r_i^{t-1})$ refers to the Euclidean distance that the i-th UAV moves from the (t − 1)-th time slot to the t-th time slot. Due to the limitations of the UAV payload, the maximum motion distance of each UAV in each time slot is limited and cannot exceed $d_{\max}$, i.e., $d(r_i^t, r_i^{t-1}) \le d_{\max}$.
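The motion constraints above (a maximum step of $d_{\max}$ per time slot and confinement to the rectangular feasible region) can be enforced with a simple projection, sketched below. The sketch assumes that UAVs move horizontally at a fixed altitude and that the previous location already lies inside the feasible region; the function name and argument layout are illustrative.

```python
import math


def project_move(prev, proposed, d_max, x_bounds, y_bounds):
    """Shorten a horizontal move to at most d_max, then clip it to the rectangle."""
    step_x, step_y = proposed[0] - prev[0], proposed[1] - prev[1]
    length = math.hypot(step_x, step_y)
    scale = min(1.0, d_max / length) if length > 0.0 else 0.0
    x = prev[0] + scale * step_x
    y = prev[1] + scale * step_y
    # Clipping to the box cannot lengthen the move because prev is inside the box.
    x = min(max(x, x_bounds[0]), x_bounds[1])
    y = min(max(y, y_bounds[0]), y_bounds[1])
    return [x, y, prev[2]]        # altitude is kept unchanged (assumption)


# Hypothetical example: a 50-unit move request limited to 10 units per slot.
moved = project_move([0.0, 0.0, 100.0], [30.0, 40.0, 100.0], 10.0,
                     (0.0, 100.0), (0.0, 100.0))
```

Applying the distance limit first and the box clipping second keeps both constraints satisfied, since projecting onto a convex region never moves a point farther from a location already inside it.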
It can be noted that the parameters r t i , ω t , C t θ , C t ϕ in Equation (24) were based on information obtained from a global perspective (namely, this information was obtained by all the UAVs in the swarm). For a single UAV, it could only sense the information broadcasted by its neighboring UAVs. Therefore, directly adopting the optimization form in Equation (24) would be unrealistic.
If $\tilde{r}_i^t$ is used to represent the set of $M_i$ UAV locations sensed by the i-th UAV in the t-th time slot (including the i-th UAV itself), the elements in $\tilde{r}_i^t$ can be represented as $\tilde{r}_{i,k}^t = [x_{i,k}^t, y_{i,k}^t, z_{i,k}^t]^T$, $k = 1, 2, \cdots, M_i$. From the perspective of the i-th UAV in the t-th time slot, the JRCO can be decomposed into the local problem in Equation (26), where $J(\cdot)$ represents the objective function of the i-th UAV for optimization. It should be noted that the optimization problem in Equation (26) is executed simultaneously for all the UAVs in each time slot. All UAVs sense the information and make decisions simultaneously.
Section 4 demonstrates numerically that solving the optimization problem in Equation (26) through distributed algorithms ultimately achieves a sub-optimal solution of the optimization problem in Equation (24).

JRCO Algorithm
The relationship between the various modules of the JRCO Algorithm is shown in Figure 2. This section elaborates on the logical relationships between the various modules described in Figure 2 first and the overall process of the algorithm. A detailed analysis and description of the functions of each module are presented later in this section. Unless otherwise specified in this section, let i = 1, 2, · · · , M, j = 1, 2, · · · , N, t = 1, 2, · · · , T.
The overall process description of the optimization algorithm was as follows: Step 1: Before the JRCO algorithm starts iteration, the system state should be initialized, including the locations of each UAV, the communication coverage radius of each UAV, and the location error variance of each UAV. This step corresponds to Box 1 in Figure 2. Next, the algorithm generates policies for all M UAVs simultaneously at each time slot t, and the process of generating the policies involves traversing all N possible values of ω t i . When the values of i, j, and t are determined, the current i-th UAV begins to enter the following sensing and decision-making steps.
Step 2: In this step, the i-th UAV calculates the variance $(\sigma_{\theta,i}^t)^2$ of its own AOA measurement error based on the communication coverage radius $\omega_i^t$ selected for the current t-th time slot. This step corresponds to Box 2 in Figure 2.
Step 3: The i-th UAV estimates the target location by networking with its neighboring $M_i$ UAVs. These $M_i$ UAVs broadcast their own AOA measurement results and location information to the i-th UAV. The specific information broadcast is introduced later in this section. This step corresponds to Box 3 in Figure 2.
Step 4: Based on the locations, the AOA measurement error variances, and the AOA measurement values of these $M_i$ UAVs, the i-th UAV estimates the target location. This estimate is used to generate the objective function, because the calculation of the CRLB for target location estimation depends on the location of the target itself; the estimated target location therefore replaces the true target location in the CRLB. This step corresponds to Box 4 in Figure 2.
Step 5: Due to the location errors of M i UAVs themselves, in order to calculate the objective function more accurately, the impact of the location errors of each UAV itself was incorporated into the AOA measurement error. The variance of the AOA measurement error at this point was defined as the composite AOA measurement variance. This step corresponds to Box 5 in Figure 2.
Step 6: Since the communication coverage of each UAV is limited and the UAV needs to achieve optimal deployment through motion, a decision-making mechanism is needed for the case in which the UAV cannot sense any information broadcast by its neighboring UAVs (this state is defined as the disconnected state of the current UAV). When the UAV is not in a disconnected state during a certain time slot, it stores information about itself and the neighboring UAVs in its own memory (the specific types of information are introduced in the second half of this section). When the UAV is in a disconnected state in a certain time slot, it retrieves the latest networking information from its memory. This step corresponds to Box 6 in Figure 2.
Step 7: With all the above steps completed, every UAV can fully construct its objective function. This step corresponds to Box 7 in Figure 2.
Step 8: Under the given j and t, the i-th UAV generates its current action, that is, a policy that adjusts $\omega_i^t$ and $r_i^t$, based on a random projection gradient descent (RPGD) method. This step corresponds to Box 8 in Figure 2.
Step 9: When j > N, the i-th UAV in the t-th time slot has completed all possible attempts at a communication coverage radius. Based on the estimated CRLB corresponding to each communication coverage range for the i-th UAV, the optimal policy (including the selection of communication coverage range and motion) for the i-th UAV can be determined.
Step 10: When i > M, all UAVs in the t-th time slot complete the selection of an optimal policy. At this point, the states of all UAVs are updated simultaneously, and the system enters the t + 1 time slot.
Step 11: When t > T, the JRCO algorithm terminates. At this point, the UAV swarm returns to the departure area and transmits the AOA measurement information of the T-th time slot to the base station in this area.
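The loop structure of Steps 1 to 11 can be summarized with the following skeleton. It only captures the nesting of time slots, UAVs, and candidate coverage radii together with the simultaneous state update; the `evaluate` callback, which would bundle Steps 2 to 8 and return an estimated CRLB and a proposed location, is a hypothetical placeholder, as are the state layout and all names.

```python
def run_jrco(state, candidate_radii, num_slots, evaluate):
    """Skeleton of the nested loops: time slots t, UAVs i, candidate radii j."""
    num_uavs = len(state["locations"])
    for _ in range(num_slots):
        decisions = []
        for i in range(num_uavs):
            # Try every candidate radius; evaluate() returns (cost, location).
            trials = [(*evaluate(state, i, radius), radius)
                      for radius in candidate_radii]
            _, location, radius = min(trials, key=lambda trial: trial[0])
            decisions.append((location, radius))
        # Every UAV decided on the same snapshot; now update all of them at once.
        state = {
            "locations": [location for location, _ in decisions],
            "radii": [radius for _, radius in decisions],
        }
    return state


def toy_evaluate(state, i, radius):
    """Stand-in cost that prefers a radius of 2 and keeps the UAV in place."""
    return abs(radius - 2.0), state["locations"][i]


start = {"locations": [[0.0, 0.0, 1.0], [1.0, 1.0, 1.0]], "radii": [1.0, 1.0]}
final = run_jrco(start, [1.0, 2.0, 3.0], num_slots=3, evaluate=toy_evaluate)
```

Collecting all decisions before overwriting the state reflects the requirement that UAVs sense and decide simultaneously, so no UAV reacts to a neighbor's move within the same time slot.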

Initialization of System State
This module initializes the system state when t = 0. Based on the definition of the system state in Equation (23), it is necessary to specify values for $r^0$, $\omega^0$, and $C_R^0$. For simplicity, all values in $\omega^0$ can be made equal, i.e., $\omega_1^0 = \omega_2^0 = \cdots = \omega_M^0 = \omega_{ini}$, where $\omega_{ini}$ represents the initial value of the communication coverage radius for all UAVs. Regarding $r^0$, it is assumed that when t = 0, all UAVs are distributed in a two-dimensional departure area with a side length of b. For $i = 1, 2, \cdots, M$, $x_i^0$, $y_i^0$, and $z_i^0$ follow a uniform distribution on the interval $[0, b]$. The values in $C_R^0$ can be specified based on the actual scenario of the UAVs.
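A minimal sketch of this initialization is given below. The dictionary layout, the function name, and the use of a single scalar location error variance are assumptions made for illustration.

```python
import random


def initialize_state(num_uavs, side_b, omega_ini, sigma_r, rng=None):
    """Draw initial coordinates uniformly on [0, b] and use one common radius."""
    rng = rng or random.Random()
    locations = [
        [rng.uniform(0.0, side_b) for _ in range(3)]   # x, y, z coordinates
        for _ in range(num_uavs)
    ]
    return {
        "locations": locations,
        "radii": [omega_ini] * num_uavs,
        "location_variance": sigma_r**2,   # assumed scalar variance for all UAVs
    }


# Hypothetical example: four UAVs, departure area of side 10, initial radius 3.
state0 = initialize_state(4, side_b=10.0, omega_ini=3.0, sigma_r=0.5,
                          rng=random.Random(0))
```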

Variance Calculation of AOA Measurement
The function of this module was to determine the AOA measurement variance σ t θ,i 2 of the UAV based on the ω t i selected by the i-th UAV in the t-th time slot when i, j, and t were all determined. The relationship between these two parameters is already shown in Equation (22).

Sensing of Neighboring UAV States and Networking
The function of this module is to sense the state of the neighboring $M_i$ UAVs based on the selected $\omega_i^t$ when i, j, and t are all determined. The information sensed by the i-th UAV includes the $\tilde{r}_i^t$ and $\tilde{C}_{\theta,i}^t$ of the $M_i$ UAVs around it. To construct the objective function for the i-th UAV, the i-th UAV also needs to estimate the target location $\hat{u}$ based on the surrounding $M_i$ UAVs. Therefore, the i-th UAV also needs the AOA measurements of the neighboring $M_i$ UAVs, whose measurement model is given in Equation (8). The estimation method for the target location is provided in Section 3.4.

Estimating Target Location
Based on the measurement model given in Equation (8) and the AOA measurement values of neighboring M i UAVs, this subsection designed relevant parameter estimation algorithms to estimate the target location. In each time slot, each UAV could only construct an objective function based on Equation (21) by estimating the target location. At present, most of the current research on optimizing the CRLB [36][37][38][39] has assumed that the accurate target location is known. It is obviously unreasonable that the target location is known before the UAV swarm can obtain an accurate target location estimation.
When the neighboring $M_i$ UAVs form a network, the geometric relationship between each UAV and the target gives:

$$\theta_k^t = \arctan\frac{y_u - y_k^t}{x_u - x_k^t}.$$

Taking the tangent of both sides and multiplying by $x_u - x_k^t$ yields:

$$(x_u - x_k^t)\tan\theta_k^t = y_u - y_k^t,$$

where $k = 1, 2, \cdots, M_i$. After some mathematical manipulations, there were: where: When the variance of the AOA measurement errors for each UAV is relatively small, the following approximate relationship holds: Thus, the following pseudo-linear equation can be obtained: where: Since $k = 1, 2, \cdots, M_i$, Equation (36) corresponds to $M_i$ equations, namely: where: The above process is repeated to obtain: where: represented the vector of the location parameter, and Equations (40) and (44)  T T + Tn t l,k , then after some mathematical manipulation, the covariance matrix of $\lambda$ can be obtained as: After some further mathematical manipulation, the cost function of the weighted least squares (WLS) method can be obtained: where $\hat{x}$ is the estimated value of $x$. The cost function can be minimized to obtain $\hat{x}$. The solving process of WLS is not elaborated here.
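To illustrate the idea of a pseudo-linear WLS estimator, the sketch below solves a simplified, azimuth-only version in the horizontal plane. It rewrites the tangent relationship as $\sin\theta_k\, x_u - \cos\theta_k\, y_u = \sin\theta_k\, x_k - \cos\theta_k\, y_k$, which is linear in the target coordinates. This is not the article's full three-dimensional estimator: the omission of elevation, the optional user-supplied weights, and all names are assumptions.

```python
import numpy as np


def pseudo_linear_wls(uav_xy, azimuths, weights=None):
    """Estimate the horizontal target position (x_u, y_u) from azimuth AOAs."""
    uav_xy = np.asarray(uav_xy, float)
    theta = np.asarray(azimuths, float)
    # One pseudo-linear equation per UAV: g_k @ [x_u, y_u] = h_k.
    g = np.column_stack([np.sin(theta), -np.cos(theta)])
    h = np.sin(theta) * uav_xy[:, 0] - np.cos(theta) * uav_xy[:, 1]
    w = np.diag(np.ones(len(theta)) if weights is None
                else np.asarray(weights, float))
    # Weighted normal equations: (G^T W G) x = G^T W h.
    return np.linalg.solve(g.T @ w @ g, g.T @ w @ h)


# Hypothetical noise-free check: three UAVs and a target at (50, 40).
uav_xy = [[0.0, 0.0], [100.0, 0.0], [0.0, 100.0]]
true_target = np.array([50.0, 40.0])
azimuths = [np.arctan2(true_target[1] - y, true_target[0] - x) for x, y in uav_xy]
estimate = pseudo_linear_wls(uav_xy, azimuths)
```

With noise-free angles the estimate coincides with the true position; with noisy angles, weights inversely proportional to the error variance of each equation would typically be supplied.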

Composite Variance Calculation of AOA Measurement
At this point, all input variables for constructing the objective function of the i-th UAV in the t-th time slot could be obtained, namely $r_i^t$, $\hat{u}$, and $C_{\theta,i}^t$. According to Equation (21), the objective function could then be constructed. However, $C_{\theta,i}^t$ did not yet include the impact of the UAV location error on the AOA measurement.
This section considers the impact of the UAV location error on the AOA measurement based on the covariance matrix $C_{\theta,i}^t$. For a local networking system composed of the $M_i$ UAVs around the i-th UAV, the errors in the x, y, and z coordinates of each UAV could be defined as $\Delta x_{i,k}^t$, $\Delta y_{i,k}^t$, and $\Delta z_{i,k}^t$, respectively, and the AOA measurement error as $n_{\theta,i,k}^t$. All of these were modeled as zero-mean Gaussian random variables, e.g., $\Delta x_{i,k}^t \sim \mathcal{N}(0, \sigma_l^2)$. For the theoretical basis of this Gaussian assumption, please refer to [40]. The AOA estimation error caused by the location error of the k-th UAV in the local networking system of the i-th UAV could be defined as $n_{s,i,k}^t$. The composite azimuth AOA measurement error was then the sum $n_{\theta,i,k}^t + n_{s,i,k}^t$, and the composite elevation AOA measurement error was the sum $n_{\phi,i,k}^t + n_{s,i,k}^t$. Based on the geometric relationship, there were: The derivation of the PDFs for $n_{\phi,i,k}^t$ and $n_{\theta,i,k}^t$ was not an easy process, but numerical simulations could verify that $n_{\phi,i,k}^t$ and $n_{\theta,i,k}^t$ approximately followed Gaussian distributions. The PDF obtained by sample fitting was compared with the fitted Gaussian distribution under the condition $x_u = 50$, $y_u = 50$, $z_u = 50$, $\sigma_l = 1$ (shown in Figure 3). The sensor in Figure 3 is located at $x_{i,k}^t = 0$, $y_{i,k}^t = 0$. Based on the properties of the Gaussian distribution, it could be concluded that both composite AOA measurement errors followed the Gaussian distribution. Their variances were defined as σ t
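The numerical check described above can be sketched in Python as follows. This is a minimal Monte Carlo illustration for the azimuth only, under the stated condition (target at (50, 50, 50), sensor at the origin, $\sigma_l = 1$); the function names, the sample count, and the assumption that the measurement error and the location-induced error are independent (so that their variances add) are choices of this sketch, not statements from the paper.

```python
import numpy as np


def location_induced_azimuth_error(target, sensor, sigma_l, n_samples, seed=0):
    """Sample the azimuth error caused by Gaussian UAV location errors."""
    rng = np.random.default_rng(seed)
    target = np.asarray(target, dtype=float)
    sensor = np.asarray(sensor, dtype=float)

    # Azimuth seen from the nominal (error-free) UAV location
    d = target[:2] - sensor[:2]
    az_true = np.arctan2(d[1], d[0])

    # Perturb the horizontal UAV coordinates with N(0, sigma_l^2) errors
    noise = rng.normal(0.0, sigma_l, size=(n_samples, 2))
    d_pert = d - noise
    az_pert = np.arctan2(d_pert[:, 1], d_pert[:, 0])

    # Wrap the angular difference to (-pi, pi]
    return np.angle(np.exp(1j * (az_pert - az_true)))


def composite_variance(var_measurement, var_location):
    """Variance of the sum of two independent zero-mean errors (assumed)."""
    return var_measurement + var_location


if __name__ == "__main__":
    err = location_induced_azimuth_error([50, 50, 50], [0, 0, 0], 1.0, 200_000)
    # First-order prediction: sigma_l^2 divided by the squared horizontal range
    print(err.var(), 1.0 / (50.0 ** 2 + 50.0 ** 2))
```

For this geometry the sample variance is close to the first-order value $\sigma_l^2 / d^2$, where d is the horizontal range, which is consistent with the approximately Gaussian behavior reported for Figure 3.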

Storage and Retrieving of Neighboring UAV State
As mentioned earlier, to ensure that each UAV could still construct an objective function while disconnected, each UAV needed to store and retrieve the state of its neighboring UAVs. The process was as follows: when the i-th UAV was not in a disconnected state in the t-th time slot, it temporarily stored $r_i^t$, $\hat{u}$, and $C_{\theta,i}^t$. Once the i-th UAV became disconnected in a subsequent time slot, it could still complete the construction of the objective function by retrieving the $r_i^t$, $\hat{u}$, and $C_{\theta,i}^t$ stored in the t-th time slot. Each UAV stored and retrieved information independently. If the UAV was not in a disconnected state in two consecutive time slots (such as the t-th and (t + 1)-th time slots), the information stored in the (t + 1)-th time slot could overwrite that of the t-th time slot to save memory.
The vector corresponding to the storage information of the i-th UAV location in the t-th time slot could be defined as: The vector corresponding to the storage information of the composite measurement variance of AOA in the t-th time slot could be defined as: The vector corresponding to the storage information of the target location estimation in the t-th time slot was defined as: Even if the i-th UAV could not sense any state information shared by other UAVs within a certain time slot, it could still smoothly generate optimization objectives based on the rules above.
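The store-and-retrieve rule above can be sketched as a small per-UAV cache. The class and field names below are hypothetical, and keeping only the most recent connected time slot follows the memory-saving rule described in the text.

```python
from dataclasses import dataclass
from typing import Optional

import numpy as np


@dataclass
class NeighborState:
    """State sensed in one connected time slot (names are illustrative)."""
    slot: int
    uav_locations: np.ndarray    # locations of the neighboring UAVs
    target_estimate: np.ndarray  # estimated target location
    aoa_covariance: np.ndarray   # composite AOA measurement covariance


class NeighborStateStore:
    """Keeps only the latest state sensed while the UAV was connected."""

    def __init__(self) -> None:
        self._latest: Optional[NeighborState] = None

    def resolve(self, sensed: Optional[NeighborState]) -> Optional[NeighborState]:
        """Return the state to use for building the objective function.

        sensed is None when the UAV is disconnected in the current slot;
        the stored state is then returned unchanged.
        """
        if sensed is not None:
            # A newer connected slot overwrites the older one to save memory
            self._latest = sensed
        return self._latest
```

Each UAV would own one such store, so that storage and retrieval remain independent across the swarm.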

Generate the Optimization Objective Function
At this point, the i-th UAV was able to build its optimization objective based on the information of the $M_i$ neighboring UAVs (or on its own stored information). Based on Equation (21), once $r_i^t$, the composite covariance matrices of the azimuth and elevation AOA measurements, and $\hat{u}$ were known, the CRLB of the target location estimation was also known.
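As an illustration of how such an objective can be evaluated, the following Python sketch computes the trace of the CRLB for AOA localization under independent Gaussian angle errors, using the standard Fisher information form $J^T C^{-1} J$. Equation (21) is not reproduced in this section, so this standard form, the function name `aoa_crlb_trace`, and the diagonal covariance are assumptions and may differ from the paper's exact expression.

```python
import numpy as np


def aoa_crlb_trace(sensors, target, var_az, var_el):
    """Trace of the CRLB of a 3D target location from AOA measurements.

    sensors: (M, 3) UAV locations; target: (3,) (estimated) target location.
    var_az, var_el: scalar or (M,) variances of azimuth/elevation errors.
    """
    sensors = np.asarray(sensors, dtype=float)
    target = np.asarray(target, dtype=float)
    m = sensors.shape[0]
    var_az = np.broadcast_to(np.asarray(var_az, dtype=float), (m,))
    var_el = np.broadcast_to(np.asarray(var_el, dtype=float), (m,))

    d = target - sensors
    dx, dy, dz = d[:, 0], d[:, 1], d[:, 2]
    h2 = dx ** 2 + dy ** 2  # squared horizontal range
    h = np.sqrt(h2)
    r2 = h2 + dz ** 2       # squared slant range

    # Gradient of azimuth = atan2(dy, dx) with respect to the target
    j_az = np.column_stack([-dy / h2, dx / h2, np.zeros(m)])
    # Gradient of elevation = atan2(dz, h) with respect to the target
    j_el = np.column_stack([-dx * dz / (h * r2), -dy * dz / (h * r2), h / r2])

    jac = np.vstack([j_az, j_el])
    inv_var = np.concatenate([1.0 / var_az, 1.0 / var_el])
    fim = jac.T @ (inv_var[:, None] * jac)  # Fisher information matrix
    return float(np.trace(np.linalg.inv(fim)))
```

Because the target location enters the Jacobian, the value can only be evaluated at the estimate $\hat{u}$ in practice, which is why the estimation step precedes the construction of the objective.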

Generate the Policy for the i-th UAV
For the optimization objective function of the i-th UAV in the t-th time slot, a method based on random projection gradient descent (RPGD) could be applied to optimize the location of the UAV. The pseudocode of the algorithm is shown in Algorithm 1, whose steps include randomly sampling $d^t$ on a unit sphere and, as step 05, returning $r_i^{t+1}$. In Algorithm 1, $\eta_{\max}$ and $\varepsilon$ represent the maximum step size and the tolerance of the RPGD method, respectively, and t and T represent the iteration variable and the number of iterations, respectively. T is determined by $T = 1/\eta^2$, where $\eta$ is the step size of each iteration, given by $\eta = \min\left(\varepsilon^2/\log(1/\varepsilon),\ \eta_{\max}\right)$. $l_{\max}$ is the maximum step size of the motion within a time slot.
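The step-size rule and the random-direction idea can be sketched in Python as follows. The complete listing of Algorithm 1 is not available in this section, so the update below is one plausible reading rather than the authors' exact procedure: the finite-difference slope along the sampled direction, the projection onto a ball of radius $l_{\max}$ around the current location, the probe length `delta`, and the function names are all assumptions of this sketch. Only the expressions for $\eta$ and T are taken from the text.

```python
import math

import numpy as np


def rpgd_step_size(eps, eta_max):
    """Step size and iteration count as stated in the text (0 < eps < 1)."""
    eta = min(eps ** 2 / math.log(1.0 / eps), eta_max)
    n_iter = math.ceil(1.0 / eta ** 2)
    return eta, n_iter


def random_direction_projected_descent(objective, r0, eps, eta_max, l_max,
                                       delta=1e-3, seed=0):
    """Random-direction descent with projection onto a ball around r0."""
    rng = np.random.default_rng(seed)
    r0 = np.asarray(r0, dtype=float)
    eta, n_iter = rpgd_step_size(eps, eta_max)

    r = r0.copy()
    best, best_val = r0.copy(), objective(r0)
    for _ in range(n_iter):
        # Sample a direction uniformly on the unit sphere
        d = rng.normal(size=r.shape)
        d /= np.linalg.norm(d)
        # Central finite-difference slope along d (assumed estimator)
        slope = (objective(r + delta * d) - objective(r - delta * d)) / (2 * delta)
        r = r - eta * slope * d
        # Project back so the move within one time slot never exceeds l_max
        offset = r - r0
        dist = np.linalg.norm(offset)
        if dist > l_max:
            r = r0 + offset * (l_max / dist)
        # Keep the best feasible point seen so far
        val = objective(r)
        if val < best_val:
            best, best_val = r.copy(), val
    return best
```

In the setting of this paper, `objective` would be the CRLB-based objective of the i-th UAV evaluated at a candidate location, and the returned point would play the role of $r_i^{t+1}$.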
When the UAV was in a disconnected state, it could adopt one of the following three policies. To balance these three motion policies, $\varepsilon_1, \varepsilon_2 > 0$ with $\varepsilon_1 + \varepsilon_2 < 1$ were defined. When $M_i = 1$ for the i-th UAV, the system generated a random number $\varepsilon$ uniformly distributed between 0 and 1, and the i-th UAV generated its next location policy according to the value of $\varepsilon$:
Method 1: When $\varepsilon \le \varepsilon_1$, the i-th UAV generated $r_i^{t+1}$ based on Algorithm 1;
Method 2: When $\varepsilon_1 < \varepsilon \le \varepsilon_2$, the i-th UAV moved by $l_{\max}$ in the direction of the geometric mean of the stored location information;
Method 3: When $\varepsilon > \varepsilon_2$, the UAV moved by $l_{\max}$ in a random direction (uniformly distributed on $[-\pi, \pi]$).
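The selection among the three methods can be sketched in Python as below. The thresholds are applied exactly as written in the text (Method 2 is chosen for values between $\varepsilon_1$ and $\varepsilon_2$, which presumes $\varepsilon_1 < \varepsilon_2$); the function names and the example threshold values are hypothetical.

```python
import math
import random


def select_policy(eps, eps1, eps2):
    """Return 1, 2, or 3 according to the thresholds stated in the text."""
    if eps <= eps1:
        return 1  # optimize the location with the RPGD-based algorithm
    if eps <= eps2:
        return 2  # move by l_max toward the stored-location reference point
    return 3      # move by l_max in a random direction


def random_heading(rng):
    """Heading angle drawn uniformly from [-pi, pi] for Method 3."""
    return rng.uniform(-math.pi, math.pi)


if __name__ == "__main__":
    rng = random.Random(0)
    eps = rng.random()  # uniform random number in [0, 1)
    policy = select_policy(eps, eps1=0.3, eps2=0.6)
    print(policy, random_heading(rng) if policy == 3 else None)
```

The example values 0.3 and 0.6 are illustrative only; they satisfy the stated conditions $\varepsilon_1, \varepsilon_2 > 0$ and $\varepsilon_1 + \varepsilon_2 < 1$.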

Pseudocode of JRCO Algorithm
The pseudocode of the JRCO algorithm is shown in Algorithm 2.

Numerical Results
This section validates the effectiveness of the proposed JRCO algorithm through numerical simulation. The simulations comprise one experimental group and four control groups, and the MSE curve of the target location estimation was plotted separately for each group. All numerical simulations in this section were run in MATLAB R2020a on an 11th Gen Intel(R) Core(TM) i7-11800H processor @ 2.30 GHz.

The First Set of Parameters
The simulation parameters of the experimental group are shown in Table 1. The focus of parameter selection was mainly on the geometric scale between the UAV and the target. The parameters were selected mainly based on the current distributed radar literature, and the literature selected for this subsection was [41].
The meanings of the experimental group and the four control groups were as follows:
Experimental group: Under the simulation parameters shown in Table 1, the joint optimization method in Algorithm 2 was applied.
Control group 1 (Sheng Xu's method [31][32][33]): Based on the simulation parameters shown in Table 1 and the JRCO algorithm in Algorithm 2, the communication coverage range of each UAV was kept unchanged (set at 150 m), and this control algorithm only planned the location of each UAV. The radar detection power and communication transmission power of each UAV therefore remained unchanged over time, and the other conditions remained consistent with the experimental group. In this case, the optimization method took only the estimated CRLB as the optimization objective and planned the location (or path) of the UAV. This method was the same as that in references [31][32][33]; this control group therefore served to compare the proposed JRCO algorithm with the current research of Sheng Xu et al.
Control group 2: Under the simulation parameters shown in Table 1, the JRCO algorithm in Algorithm 2 was performed, but the communication coverage range of all UAVs was infinite. The measurement error of the AOA for each UAV was calculated based on a communication coverage radius of 150 m, which was equivalent to a centralized algorithm. Meanwhile, the optimal locations of all UAVs were generated through the same RPGD method. That is, all UAVs applied an equal step size in the RPGD method when executing the algorithm, and other conditions were consistent with the experimental group.
Control group 3: Under the simulation parameters shown in Table 1 and based on the JRCO algorithm in Algorithm 2, the communication coverage range of all UAVs was infinite. The measurement error of the AOA for each UAV was calculated based on a communication coverage radius of 150 m, which was equivalent to a centralized algorithm. However, the location of only one UAV was planned in each time slot, with the UAVs planned in sequence over different time slots, while the other conditions were consistent with the experimental group.
Control group 4: Under the simulation parameters shown in Table 1, the JRCO algorithm in Algorithm 2 was performed, but the communication coverage range of all UAVs was infinite. The measurement error of the AOA for each UAV was calculated based on a communication coverage radius of 150 m, which was equivalent to a centralized algorithm. Other conditions were consistent with the experimental group.
Figure 4a shows the trend of the MSE of the target location estimation over time slots 0-100. At the beginning of the iteration, the MSE curves corresponding to the experimental group, control group 1, and control group 4 decreased significantly faster than those corresponding to control group 2 and control group 3.
Based on Figure 4b, it can be seen that, at the beginning of the iteration, the experimental group showed better performance than control group 1 (Sheng Xu's method [31,33]). For example, in the 10th time slot, the MSE generated by the experimental group was about 440 m lower than that of control group 1, and in the 20th time slot, it was approximately 270 m lower. This demonstrated the effectiveness of the JRCO algorithm, which achieved a better target localization performance than the current research of Sheng Xu et al. The JRCO algorithm adopted by the experimental group achieved this performance by fully utilizing the degrees of freedom in the radar and communication systems.
The MSE curves of the experimental group, control group 2, and control group 3 were compared, as shown in Figure 4a. The MSE of the experimental group decreased the fastest over time at this time slot scale. Control group 2 applied a centralized algorithm, but its MSE values fluctuated significantly. This was because the same step size across all the time slots led to poorer convergence of the JRCO algorithm. As a result, the curve of control group 2 showed no convergence trend, and the iterative efficiency of the algorithm in this group was relatively low. Although control group 3 also adopted a centralized algorithm, the location of only one UAV was optimized in each time slot, which greatly reduced the iterative efficiency of the algorithm.
Finally, the MSE curves corresponding to the experimental group, control group 1, and control group 4 were compared. These three curves were relatively close on the scale of 0-100 time slots. Therefore, Figure 4b,c provide the MSE curves in the range of 10-20 time slots and at the 100th time slot, respectively. From the condition settings of the five simulation experiments, it could be seen that the optimization algorithm applied in control group 4 could provide the best performance (i.e., the corresponding MSE curve decreased the fastest over time). From Figure 4c, it could be seen that at the end of the algorithm, the MSE curves of control group 4 and control group 1 were actually closer to each other, and the experimental group obtained the lowest MSE. The MSE of the experimental group (5.
To demonstrate other details generated by the joint optimization algorithm, Figure 4d provides the initial and final locations of the UAV swarm (only the results on the xOy two-dimensional plane were plotted). Figure 4e shows the average operation time of each time slot as a function of the number of UAVs.

The Second Set of Parameters
The simulation parameters of the experimental group are shown in Table 2. In terms of parameter selection, the literature selected for this subsection was [42]. The parameters not given in Table 2 were consistent with Table 1. The settings of the experimental group and the four control groups were consistent with Section 4.1, with a default communication range of 1.5 km. The corresponding numerical simulation results are shown in Figure 5.
Figure 5a shows the MSE curves generated by each group over time slots 0-100. The curves corresponding to the experimental group, control group 3, and control group 4 decreased rapidly. Figure 5b compares the curves of the experimental group and control group 4 over time slots 10-20. In the early stages of the iteration, the experimental group showed a better performance than any control group. Figure 5c shows the final MSE generated by each group at the 100th time slot. At this time, the MSE of the experimental group (25.92 m) was approximately 39.48% (16.91 m) lower than the MSE of control group 4 (42.83 m). The above numerical simulation results demonstrate the effectiveness of the proposed JRCO algorithm and indicate that the joint optimization algorithm is suitable for a range of scenarios. Figure 5d shows the initial and final locations of the UAV swarm.

Conclusions and Future Works
This article proposes a novel joint optimization algorithm for the radar detection-communication transmission of DAR. Compared with current research on the resource allocation of DAR, the JRCO algorithm has two prominent features. Firstly, it synergistically utilizes the degrees of freedom in both the radar system and the communication system, overcoming the limitation of traditional radar resource allocation, which focuses only on the optimization variables of radar systems. The proposed JRCO algorithm can achieve a better radar detection performance while satisfying the communication performance constraints. Secondly, the JRCO algorithm is fully distributed, in stark contrast to the current centralized algorithms for resource allocation. Distributed algorithms give the system a higher task execution efficiency, a lower communication transmission delay, and a stronger anti-damage ability.
To implement the JRCO algorithm, this article first took the estimation of the target location in the passive AOA measurement mode as an example, modeled the system comprehensively with respect to aspects such as the location error and communication coverage of the UAV, and provided a formal representation of the joint optimization problem. Then, the JRCO algorithm was elaborated in eight subsections. Finally, the effectiveness of the JRCO algorithm was verified by numerical simulation; that is, the JRCO algorithm achieved a better target localization performance.
However, the main drawback of the JRCO algorithm is that its iteration and update process is relatively slow. Figure 4e shows that, as the number of UAVs in the swarm increased, the iteration efficiency of the JRCO algorithm decreased significantly. In future work, we intend to focus on addressing this issue. In addition, more complex radar systems and communication system topologies that are suitable for airborne radar will be considered. For radar systems, it is also possible to optimize the CRLB for target position estimation in active detection scenarios. For communication systems, the presence of relays should be considered.