Offense-Defense Distributed Decision Making for Swarm vs. Swarm Confrontation While Attacking the Aircraft Carriers

Abstract: The effectiveness of multiple UAVs at attacking an enemy target while simultaneously defending against attacking UAVs has gained researchers' attention due to their cost efficiency and high mission success rates. This paper explores UAV swarm vs. swarm offense-defense confrontation decision-making while attacking aircraft carriers on the open sea. The system is developed as a multi-agent system. Specifically, every UAV is modeled as an independent agent that makes decisions based on the behavioral rules of the swarm, its detection radius, and enemy locations. A distributed auction-based algorithm with a limited data-sharing rate is formulated among the swarm UAVs. A critical feature of the proposed strategy is that it empowers each UAV to make decisions in real time during the mission despite the relatively limited update rate of the UAV communication hardware. The algorithm parameters are optimized to improve target allocation and elimination. Simulation results demonstrate the effectiveness of the proposed offense-defense UAV swarm strategy.


Introduction
Unmanned Aerial Vehicle (UAV) swarm technology has seen tremendous development due to its civil and military applications, such as surveillance [1,2], search and rescue [3][4][5], fire-fighting, and combat operations [6,7]. A single UAV has limited potential due to its restricted search range and load capacity. UAV swarms are more adaptable to the environment and offer greater control and flexibility thanks to the available UAV redundancy, which makes them suitable for complex missions with multiple constraints [8,9]. Inexpensive UAV technology and its wide application prospects have drawn attention from researchers and organizations, for instance, the U.S. Air Force's Small Unmanned Aircraft Systems (SUAS) Flight Plan [10], the NASA UTM program [11], and the European U-space initiative [12]. Employing UAV swarms to break through enemy defense systems and perform a saturated attack, or to intercept intruders, is theoretically feasible and realizable in near-future military operations. Thus, focused effort is needed to develop efficient algorithms for UAV swarm-based offense-defense operations.
The employment of multiple UAVs with mutual communication increases their environmental perception and their ability to perform coordinated task assignment and collaborative reconnaissance missions. Operating UAVs as a swarm improves their overall combat effectiveness and survivability [13]. In an environment with multiple UAV swarms, cooperating with allies and attacking enemies forms a multi-agent system, and environment and mission constraints are usual occurrences during swarm missions. Developing algorithms that can perform various missions in such a multi-agent dynamic environment is a challenging and complex task of critical research significance. The UAV swarm offense-defense problem is a territorial defense problem [14]. The target search mission was investigated in [15], where a biologically inspired algorithm was presented to solve the problem. An electronic-attack application of a mini-UAV swarm was discussed in [16]; it was shown that employing UAV swarms for an attack can reduce the associated risks and costs while improving mission sustainability. A defensive algorithm for UAVs facing an attacking swarm was presented in [17], and the scenario was later studied for UAV task negotiation in a swarm [18]. An agent-based model was developed to study the defense success rate in [19], and a swarm-versus-swarm algorithm using agent-based modeling was presented in [20]. A hierarchical approach combining neuroevolution of augmenting topologies with a model-predictive path controller was presented in [21]. Although the discussed literature provides sufficient operational effectiveness, decision-making becomes challenging as the UAV swarm scale increases. In the UAV-swarm-based offense-defense confrontation problem, target allocation is a critical issue.
In addition, maximizing the profit and defining behavioral rules that shape the combat behavior of the UAV swarm are other essential issues.
Swarm-based operations usually involve several UAVs; therefore, decentralized allocation methods are more favorable than centralized ones to ensure the system's stability. A consensus-based algorithm (CBA) and a distributed CBA were investigated for multi-target allocation problems in [22]. However, these assume that target information is available to the agents in real time, which may not be the case in a real combat scenario. Thus, a bundle CBA (BCBA) [23] and an extended BCBA (EBCBA) [24] were presented to deal with targets for which no prior information is available. Recently, a stochastic clustering auction (SCA) was used to develop an algorithm in which convergence speed can be traded off against solution optimality [25]. The CBA keeps the solution space computationally tractable, while the SCA facilitates the essential trade-off based on practical mission requirements [26]. Therefore, the target allocation algorithm should be computationally stable and flexible enough to accommodate the required trade-offs.
Another critical issue is keeping the UAVs in the swarm as a cohesive unit and defining behavioral rules that make the swarm act as one. The Boids model, which consists of three simple rules, i.e., alignment, cohesion, and separation, was presented in [27,28] based on observations of bird flocks. Many swarm motion models were built on these three rules, such as the Vicsek model [29] and the Couzin model [30]. In the Vicsek model, the speed of each individual is constant and its heading depends on the average velocity of the neighboring individuals, whereas the Couzin model divides an individual's perception into three non-overlapping zones, allowing flexible behavior as the zone widths change. A hybrid control scheme was developed for formation flight in swarms in [31,32]. Some distributed algorithms for UAV swarms, derived from the collective motion of bird flocks, were presented in [33,34], providing a foundation for self-organized missions using UAV swarms.
In a swarm of UAVs, distributed control plays a critical role, since it allows complex tasks to be performed with low-cost UAVs that have limited payload capacity and restricted avionics and communication abilities. Multiple low-cost UAVs can accomplish demanding tasks that could not otherwise be performed by a single sophisticated system. In this research, we develop a UAV-swarm-based offense-defense combat environment with restricted communication rates due to cost limitations; consequently, the number of UAVs that a UAV can select as neighbors is limited by the communication system. We study the effectiveness of a UAV swarm in an equal-opportunity offense-defense combat environment under these limitations, together with the effect of varying the parameters of the distributed estimated auction algorithm.
The devised Offense-Defense Distributed Decision Making (ODDDM) algorithm has the following salient features:

• The ODDDM algorithm solves the offense-defense confrontation problem (ODCP) with a moving high-priority target (HPT), such as an aircraft carrier, a setting that has received little attention in the literature. A real-time combat model is established to update the states of all UAVs and HPTs on both sides, and the dynamic nature of the swarm is maintained in the event of UAV losses. The work extends previously published work [13] with more realistic assumptions. For instance, the previous work required continuous data communication among neighbors until the target was finalized; communication systems with such high update rates are relatively expensive, while the UAVs in the swarm are meant to be cost-efficient, making continuous communication hard to realize. The proposed ODDDM algorithm instead requires one-time data communication among neighbors per time step for target finalization of the selected UAV.
• The target allocation decision-making in the proposed ODDDM algorithm determines the offense/defense behavior of a particular UAV in the swarm, i.e., the target selection and the firing of a weapon at a target in range. To this end, the ODDDM algorithm uses a distributed estimation-based allocation approach to obtain data on enemy UAVs in the neighborhood and to allocate a target that maximizes profit. In contrast, our earlier work [13] employed a cumulative allocation algorithm. The proposed algorithm is more applicable to practical combat situations, as the enemy UAVs are unknown beforehand, and it is more efficient because it considers only the enemy UAVs within the detection range rather than all enemy UAVs.
• The UAV-motion decision-making generates the UAV guidance commands. The guidance command has two main components: one generated from the target position, and one generated from the neighbors to satisfy the swarm principles of cohesion, separation, and alignment. Only the real-time locations of the neighboring UAVs are required for this part of the algorithm.
The algorithm is implemented and tested in simulation, which acts as a test bench for the described algorithm. Such a study can help analyze and devise multiple strategies before the actual mission on the battlefield.
The paper is organized as follows. The UAV swarm vs. swarm confrontation section describes the combat scenario and the dynamics of a UAV in a swarm. The algorithm structure section describes the proposed algorithm, including both target allocation and swarm motion decision-making. The effectiveness of the decision-making algorithm is simulated and analyzed in the simulation and discussion section. Conclusions regarding the algorithm parameters and the enhanced system capabilities are presented in the final section.

UAVs Swarm vs. Swarm Confrontation Problem Description
First, we define the offense-defense problem using UAV swarms to study UAV combat operations. Then, exploiting the combat environment, a dynamic model is developed to observe the state behavior of the UAVs in the swarm and to ensure dynamic combat based on the decision-making process.

UAV Swarm-Based Combat Scenario
The combat scenario consists of two moving aircraft carriers in confrontation on the sea surface, each equipped with a UAV swarm. For convenience, the two sides are labeled red and blue. To construct the dynamics of the UAVs in the combat scenario, we refer to the red aircraft carrier and UAVs as the home side and the blue ones as the enemy. At the start, the UAVs are already airborne at random positions near their respective aircraft carriers. The combat environment is shown in Figure 1. Both sides have the same number of identical, cost-effective, easy-to-build UAVs. Each swarm is assumed to have N UAVs, and the number of weapons on each UAV is limited to N_W. A UAV can attack an enemy within the attack range R_a and detect enemy UAVs and aircraft carriers within a limited detection radius R_s, where R_s > R_a. The communication radius R_c, within which a UAV transmits and receives data with its neighbors, is also limited. The aircraft carriers (HPTs) can move at speeds up to V_Tmax in arbitrary directions; in the scenario considered here, they move at a constant speed V_AC in a fixed direction.
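The scenario parameters above can be collected in a small configuration object. The sketch below uses the symbols from the text; N = 20, N_W = 5, and V_AC = 15 m/s are taken from the simulation section, while the numeric values of R_a, R_s, and R_c are placeholders assumed here for illustration only.

```python
from dataclasses import dataclass

@dataclass
class CombatConfig:
    """Illustrative container for the combat-scenario parameters."""
    n_uavs: int = 20          # N: UAVs per swarm (simulation section)
    n_weapons: int = 5        # N_W: weapons per UAV (simulation section)
    r_attack: float = 500.0   # R_a: attack range [m] (placeholder value)
    r_detect: float = 1500.0  # R_s: detection radius [m] (placeholder value)
    r_comm: float = 1000.0    # R_c: communication radius [m] (placeholder value)
    v_carrier: float = 15.0   # V_AC: carrier speed [m/s] (simulation section)

cfg = CombatConfig()
# The scenario requires the detection radius to exceed the attack range.
assert cfg.r_detect > cfg.r_attack
```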
In a UAV swarm vs. swarm ODCP, a UAV has two main objectives: to destroy the enemy HPT, a moving aircraft carrier in our case, in offense mode, and to save its own carrier by attacking enemy UAVs in defense mode. Heuristic data on the location of the enemy HPT are available at the start of the simulation to give each side an equal opportunity to win. A UAV updates its final target location only when it does not find the HPT at its initial position. The mission environment is the open-sea area, free of threat and no-fly zones.

Combat Dynamics
In this subsection, the state dynamics governing the UAVs in the swarm and the HPT movement are developed. During offense-defense combat, the states change rapidly; therefore, to reflect an actual combat situation, the states must be updated in real time.
Consider the UAVs in a swarm to have the states Z_I^X(k) and V_I^X(k) representing position and velocity, P_I^X(k) the survival probability, and W_I^X(k) the number of weapons of each UAV during the mission, as mentioned in [13]. We also define X ∈ {Blue, Red} such that, if X is Red, then X' (the opposing side) is Blue, and vice versa. The state equations for the UAV motion are as follows:

Z_I^X(k+1) = Z_I^X(k) + V_I^X(k) Δt
V_I^X(k+1) = V_I^X(k) + a_I^X(k) Δt

where a_I^X(k) is the acceleration of the I-th UAV at the k-th time step, and Δt is the step size.
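Under the first-order (Euler) update of the motion states, one propagation step can be sketched as follows; the function name is illustrative, not from the paper.

```python
import numpy as np

def update_uav_state(pos, vel, acc, dt):
    """One Euler step of the UAV motion model:
    Z(k+1) = Z(k) + V(k)*dt,  V(k+1) = V(k) + a(k)*dt."""
    new_pos = pos + vel * dt
    new_vel = vel + acc * dt
    return new_pos, new_vel

pos = np.array([0.0, 0.0])
vel = np.array([80.0, 0.0])   # initial speed of 80 m/s (simulation section)
acc = np.array([0.0, 2.0])    # example commanded acceleration
pos, vel = update_uav_state(pos, vel, acc, dt=0.1)
```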
For an active UAV in a combat mission, the position must be updated after every time step, and each UAV operates in a particular mode at any given time. The mode of operation of the I-th UAV is

M_I(k) ∈ {1, 0, −1}

where M_I(k) = 1 denotes offense (attacking the enemy HPT), M_I(k) = 0 denotes defense (attacking an enemy UAV), and M_I(k) = −1 denotes a destroyed UAV. A UAV cannot fire its weapons simultaneously; the minimum time between two consecutive fires is T_f. Let T be the target (enemy UAV or HPT) selected at time step k by UAV I. The state representing the number of weapons carried by UAV I at time step k is given by

W_I(k) = W_I(k−1) − F_IT(k)

where F_IT(k) ∈ {0, 1} indicates whether the I-th UAV has fired a weapon at time step k.
The UAV fires a weapon at the enemy when the attack conditions are satisfied, i.e., the number of weapons is not zero, the UAV mode M is not −1, the attack superiority S_IT over the selected target T is greater than the threshold S_th, and the target is within the attack radius R_a:

F_IT(k) = 1 if W_I(k) > 0, M_I(k) ≠ −1, S_IT(k) > S_th, and L_IT(k) ≤ R_a; otherwise F_IT(k) = 0

where L_IT is the distance between the I-th UAV and the target T. Note that the target T can be either an enemy UAV or the HPT. If the number of weapons reaches 0, UAV I no longer shares profit data with its neighbors; it only acquires data from neighbors for target selection, with the aim of dodging enemy UAVs. This reduces the enemy's expended fire-power if it is attacked and, if it reaches the enemy HPT (B), it self-destructs by hitting it. The survival probability of a UAV/HPT decreases when an enemy weapon hits it, according to the superiority of the enemy UAV. If the survival probability of a UAV falls below the threshold P_th, its mode M is set to −1: the UAV has been destroyed, is removed from the swarm, and its motion states cease to update. The survival probability of the I-th UAV in its swarm is given by

P_I(k) = P_I(k−1) · C_I(k),  with  C_I(k) = ∏_J (1 − K_JI(k))

where J ∈ {1, 2, 3, ..., N} indexes the enemy UAVs of I, C_I(k) is the survival probability of the I-th UAV during time step k, and K_JI(k) is the kill probability of UAV I by enemy UAV J from time step k−1 to k.
The kill probability is modeled as

K_JI(k) = β_w K_p F_JI(k)

where β_w is the environmental factor affecting the hit probability of the weapon, K_p is the kill probability of the weapon, and F_JI(k) indicates whether enemy UAV J fired at I during the step. The operational flow diagram for UAV I is explained in Figure 2.
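A minimal sketch of the weapon-release check and the survival-probability update described above, assuming independent hits so that the per-step survival factor is the product of (1 − K_JI) over the attacking enemies. The numeric values used in the example (K_p = 0.8, β_w = 0.95) come from the simulation section; the function names are illustrative.

```python
def can_fire(weapons, mode, superiority, dist, s_th, r_attack):
    """Weapon-release conditions from the text: weapons remaining,
    UAV not destroyed (mode != -1), superiority above the threshold
    S_th, and the target inside the attack radius R_a."""
    return weapons > 0 and mode != -1 and superiority > s_th and dist <= r_attack

def update_survival(p_prev, kill_probs):
    """Survival-probability update: the UAV survives the step only
    if every incoming shot misses (hits assumed independent)."""
    p = p_prev
    for k_ji in kill_probs:     # kill probability of each enemy shot
        p *= (1.0 - k_ji)
    return p

# One enemy shot with weapon kill probability 0.8 and
# environmental factor 0.95 (values from the simulation section).
fired = can_fire(3, 1, 0.95, 400.0, s_th=0.9, r_attack=500.0)
p = update_survival(1.0, [0.8 * 0.95])
```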
The state of the HPT representing its position is updated as follows:

Z_A(k+1) = Z_A(k) + V_A Δt,  with ||V_A|| = V_AC

where A is the aircraft carrier on the side of UAV I.
The survival probability state of the HPT is updated according to

P_A(k) = P_A(k−1) · C_A(k),  with  C_A(k) = ∏_J (1 − K_JA(k))

where C_A(k) is the instantaneous survival probability of HPT A from time step k−1 to k and K_JA(k) is the kill probability of UAV J against HPT A, given by

K_JA(k) = β_w K_pb F_JA(k)

where K_pb is the constant for the killing efficiency of a UAV against the HPT.

Figure 2. Operational flow diagram of UAV I in an offense-defense combat situation.

Structure of ODDDM Algorithm
The overall architecture of the decision-making algorithm for swarm-versus-swarm combat is designed based on a bottom-up strategy. The decision-making problem in the combat process mainly includes target allocation decision-making and swarm motion decision-making.

Target Allocation Decision Making
In the decision-making process used to allocate the target, each UAV assesses the situation and calculates its superiority against all detected enemies. The effect of selecting a certain detected enemy on the overall objective of the mission is evaluated in terms of profit. The profit made by attacking the target is calculated based on the superiority of the UAV against all potential targets and the target survival probability. The UAVs then allocate the targets through negotiations based on the distributed estimated auction algorithm (DEAA) with the goal of maximizing the total profit.

Profit Calculation
The superiority of UAV I over an enemy UAV J generally depends on three main factors [13]: the distance to the enemy UAV L_IJ, the line-of-sight angle from I to J (β_IJ), and the speeds of the two opponents, as represented in Figure 3a. Each factor is normalized so that its maximum value is 1, and hence the total superiority is at most 1. The total superiority of I over J is written as

S_IJ = ω_1 S_L(L_IJ) + ω_2 S_β(β_IJ) + ω_3 S_V

where ω_1, ω_2 and ω_3 are the weight coefficients with ω_1 + ω_2 + ω_3 = 1, and S_L, S_β, S_V are the distance, angle, and speed superiority factors, respectively.
The superiority of UAV I over the enemy HPT B depends on two factors [13]: the distance L_IB and the line-of-sight angle α_IB from I to the HPT, as represented in Figure 3b. Each factor has a maximum value of 1, and the total superiority is at most 1:

S_IB = ω_T1 S_L(L_IB) + ω_T2 S_α(α_IB)

where ω_T1 and ω_T2 are the weight coefficients with ω_T1 + ω_T2 = 1.
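The two weighted superiority sums can be sketched directly. The factor values passed in are assumed to be pre-normalized to [0, 1] (the paper's normalization functions are not reproduced here), and the default weights are those used later in the simulation section.

```python
def superiority_uav(s_dist, s_angle, s_speed, w=(0.3, 0.6, 0.1)):
    """Total superiority S_IJ = w1*S_L + w2*S_beta + w3*S_V.
    Each factor must already be normalized to [0, 1]."""
    w1, w2, w3 = w
    assert abs(w1 + w2 + w3 - 1.0) < 1e-9   # weights must sum to 1
    return w1 * s_dist + w2 * s_angle + w3 * s_speed

def superiority_hpt(s_dist, s_angle, w=(0.4, 0.6)):
    """Total superiority S_IB = wT1*S_L + wT2*S_alpha against the HPT."""
    return w[0] * s_dist + w[1] * s_angle

s_uav = superiority_uav(0.9, 0.8, 1.0)
s_hpt = superiority_hpt(0.5, 1.0)
```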
The profit variables of each UAV against all detected enemy UAVs are computed online by each UAV:

C_IJ = [α S_IJ + (1 − α) P_J] V_J   (14)

where P_J is determined from Equation (5), V_J is the value of the J-th enemy UAV, and 0 ≤ α ≤ 1. The parameter α determines whether the UAV prefers to attack the enemy over which it has maximum superiority or the enemy with maximum survival probability: α = 1 yields maximum profit for the target with maximum superiority, whereas α = 0 yields maximum profit for the target with maximum survival probability.
Similarly, the profit of attacking the enemy HPT is computed as

C_IB = f [α S_IB + (1 − α) P_B] V_B   (15)

The profit for every UAV in a swarm against all detectable enemy UAVs and the HPT is calculated with Equations (14) and (15). Each UAV tries to maximize the total profit of the system:

max Σ_I ( Σ_J x_IJ C_IJ + x_IB C_IB )

subject to

Σ_J x_IJ + x_IB ≤ 1,  x_IJ, x_IB ∈ {0, 1}

where x_IJ and x_IB are the decision variables: x_IJ = 1 means the I-th UAV of the home side is assigned to the J-th UAV of the enemy side, and x_IB = 1 means the I-th UAV is assigned to the enemy HPT. The coefficient f expresses the preference between attacking the enemy HPT and defending against attacking enemy UAVs: a higher value makes a UAV attack the enemy HPT even when it has an enemy UAV in sight. Variations in this parameter affect the final mission outcome.
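For intuition, the profit blend and a one-target-per-UAV assignment can be sketched as below. The greedy allocator is a centralized stand-in for the distributed auction (useful only to see what the objective rewards), and the exact blend of superiority and survival probability is an assumption about Equations (14) and (15).

```python
import numpy as np

def profit_uav(superiority, p_survive, value, alpha=0.5):
    """Illustrative profit: alpha=1 favors high superiority,
    alpha=0 favors the target most likely to survive."""
    return (alpha * superiority + (1.0 - alpha) * p_survive) * value

def greedy_assign(profit_matrix):
    """Centralized greedy one-target-per-UAV assignment that
    approximately maximizes total profit (auction stand-in)."""
    n_uav, _ = profit_matrix.shape
    taken, assignment = set(), [-1] * n_uav
    # Let UAVs with the highest best-profit pick first.
    for i in np.argsort(profit_matrix.max(axis=1))[::-1]:
        for j in np.argsort(profit_matrix[i])[::-1]:
            if int(j) not in taken:
                assignment[int(i)] = int(j)
                taken.add(int(j))
                break
    return assignment

blend = profit_uav(0.9, 0.5, 1.0, alpha=0.5)
P = np.array([[0.9, 0.2],
              [0.8, 0.7]])
a = greedy_assign(P)
```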

Neighbor Detection
Let N_Il be the maximum number of neighbors that UAV I can have and N_Im the number of UAVs that satisfy the communication-radius condition R_c. If there are more than N_Il UAVs within R_c of UAV I, it takes the nearest N_Il UAVs as its neighbors; otherwise, all UAVs within R_c are taken as neighbors:

N_In = min(N_Il, N_Im)   (18)

where N_In is the actual number of neighbors.
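The neighbor rule, keeping at most N_Il of the nearest UAVs inside R_c, can be sketched as:

```python
import numpy as np

def select_neighbors(my_pos, positions, r_comm, n_max):
    """Return the indices of at most n_max (N_Il) nearest UAVs
    within the communication radius r_comm (R_c), so that the
    actual neighbor count is N_In = min(N_Il, N_Im)."""
    d = np.linalg.norm(positions - my_pos, axis=1)
    in_range = np.where((d > 0) & (d <= r_comm))[0]   # exclude self (d == 0)
    order = in_range[np.argsort(d[in_range])]          # nearest first
    return order[:n_max].tolist()

pos = np.array([[0.0, 0.0],   # UAV I itself
                [1.0, 0.0],
                [2.0, 0.0],
                [9.0, 0.0],   # outside R_c
                [0.5, 0.0]])
nbrs = select_neighbors(pos[0], pos, r_comm=5.0, n_max=2)
```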

Consensus-Based Estimated Target Allocation
UAV I initially decides on its target based on the profit calculated in Equations (14) and (15). The profits that its neighbors calculated against all detected enemy UAVs are shared with UAV I and saved in the matrix O. In addition, it saves a vector F_I containing all possible defense/attack options based on the calculated profit. The profit for enemy UAVs not detected by a particular neighbor is set to 0. Since the positions of the detected enemy UAVs are known, the decision is based on these UAVs and the HPT.
The vector F_I consists of the decision variables for the possible attack options. The profits of all these attack options are saved in the bid matrix J, each column of which represents the expected bid vector of a neighbor against all enemy UAVs. The bid vector C_I is shared with the UAVs that selected the I-th UAV as a neighbor. The maximum profit is computed in an iterative optimization process that improves the overall profit of the system; as a result, the vector F_I is updated so that its value is 1 for all possible targets (enemy UAVs or the HPT). The update time step of this target allocation method is a multiple of the swarm motion update time step Δt; the multiplication factor depends on two parameters, namely the computation power available within the UAV to allocate the target and the maximum communication rate of the UAV with its neighbors.
The target allocation process comprises two phases, which are repeated one after the other until the profit vector stops updating, as can be seen in Algorithm 1.
Auction Phase: In the auction phase, UAV I saves the profit vector O_I containing the profits of attacking all enemy UAVs detected by UAV I and its neighbors. Assume N_E enemy UAVs are detected. The vector O_I then has N_E + 1 elements: one for each of the N_E enemy UAVs and one for the HPT. The vector ID_I contains the target identification information; here, the target location estimated by UAV I and its neighbors is used for identification. F_I, O_I, and ID_I all have the same size N_E + 1. Let C_IG be the bid that UAV I places on target G, where G is a member of the set of N_E detected enemy UAVs. F_I is the vector of decision variables for the possibility of an enemy UAV being selected as a target and is updated as follows:

F_IG = 1 if C_IG > O_IG, and 0 otherwise

where F_IG represents the G-th element of the vector F_I. The last element of F_I is set to 1 because the HPT is always an option for a UAV to attack. The target G* is selected by the UAV according to

G* = argmax over {G : F_IG = 1} of C_IG

The profit of I against the selected target G* is C_IG*, and the vector element O_IG* is updated to C_IG*. If Z_I is the target selected by the I-th UAV, then Z_I = G*.

Algorithm 1: Distributed estimated auction algorithm for UAV I.
INPUTS: profit vectors of the neighboring UAVs against detected enemies; target IDs from the neighboring UAVs; the profit vector of UAV I.
OUTPUT: selected target for the UAV.
INITIALIZE: bid matrix of size (N_E + 1) × (N_n + 1) holding the bid vector of UAV I and each neighbor; target array Z = 0 of size 1 × (N_n + 1).
The auction process is repeated for all selected neighbors in order to estimate the final targets of the neighbors.
Consensus Phase: The auction-phase data of all neighboring UAVs are used to reach a consensus, resulting in the target selection for the particular UAV. The consensus is performed among the winning bids to avoid duplicate allocation of targets among neighbors, which saves computational resources and maximizes profit. For instance, data sharing by a UAV with its neighbor generates a matrix containing the profit vectors against nearby enemies. The communication topology among the UAVs is dynamic and changes with the UAVs in the neighborhood. The shared data include the winning-bid vector and the profit matrix O generated in the auction phase for the UAV and its neighbors. The algorithm then finds the neighbor with the maximum bid on the target selected by the UAV.
If a neighbor has a higher bid than I on that target, then I loses its bid and selects B, the enemy HPT, as its target.
From the O matrix, the maximum profit across the complete neighbor list for the G-th enemy UAV is located; the bid-vector value J_IG is updated with J_Q*G, where Q* is the neighbor holding the maximum bid, and the algorithm returns to the auction phase.
The algorithm is repeated until the result Z_I is finalized, i.e., the selected target for I does not change in n consecutive iterations.
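The two-phase loop can be illustrated with a toy synchronous CBAA-style round. This sketch omits the estimated-profit sharing and target-ID matching of the full DEAA, and a losing UAV here falls back to its next-best option rather than directly to the HPT, so it shows only the auction/consensus mechanics, not the paper's exact rule.

```python
import numpy as np

def cbaa_round(profits, neighbors):
    """Toy synchronous auction/consensus sketch: each UAV bids on its
    best target (auction phase), then yields whenever a neighbor holds
    a higher bid on the same target (consensus phase)."""
    n = len(profits)
    # Auction phase: every UAV bids on its most profitable target.
    target = [int(np.argmax(p)) for p in profits]
    bid = [float(profits[i][target[i]]) for i in range(n)]
    # Consensus sweeps: resolve duplicate allocations among neighbors.
    for _ in range(n):
        for i in range(n):
            for j in neighbors[i]:
                if target[j] == target[i] and bid[j] > bid[i]:
                    # UAV i loses the auction: mask the lost target
                    # and rebid on the next-best option.
                    p = profits[i].copy()
                    p[target[i]] = -np.inf
                    target[i] = int(np.argmax(p))
                    bid[i] = float(profits[i][target[i]])
    return target

# Two UAVs, two targets; both initially prefer target 0.
profits = [np.array([0.9, 0.4]), np.array([0.8, 0.7])]
t = cbaa_round(profits, neighbors=[[1], [0]])
```

After the consensus sweep, the lower bidder yields target 0 and the duplicate allocation is resolved.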

Swarm Motion Decision Making
In the swarm motion decision-making process, the UAVs choose their behavioral rules based on the target allocation results and generate the control input used to update their speed and position. This section is the same as in the previous work [13], with minor updates. If a UAV decides to attack the enemy HPT, it follows the offensive behavior; otherwise, it adopts a defensive behavior towards an enemy UAV. The decision-making is carried out in real time, and the states of the UAVs and aircraft carriers are updated according to the decision results in real time, which makes the approach more practical for actual combat applications. Moreover, limits on the computational resources are imposed, making the combat scenario more realistic, although the computation becomes more complex and harder to perform in real time.
The motion of each UAV in the swarm is derived from the neighboring UAVs. The UAV motion model is updated at every time step based on the position, velocity, and acceleration, where the acceleration is commanded directly by the control input:

a_I(k) = u_I(k)

The forces acting on a UAV are computed from the UAV's position in the swarm and its intent to attack the target (enemy UAV or HPT). The UAV's desired position is derived from the social-force component of the control input, which depends on the number of neighbors (swarm UAVs within a particular radius). It is assumed that the number of UAVs that can share data with a given UAV is limited by the computational requirements of the UAV hardware; the neighbors of UAV I are the N_In UAVs given by Equation (18). The neighbor-detection process is critical for maintaining the swarm structure when large density changes occur in the swarm due to severe enemy attacks or multi-agent failures.

Control Input Components Based on Behavioral Rules
We rely on the concept of the swarm; therefore, the agents in the swarm must adhere to the swarm's behavioral rules. The motion-related principles are cohesion, alignment, and separation [27]. The UAVs must not drift too far from each other, so that they can sustain communication; they must not come too close, lest they collide; and, for the same mission, the UAVs must maintain a uniform speed among their neighbors.
Cohesion: Every UAV generates an attraction force towards its neighbors such that the distance between them does not grow too large. The attraction force can be described as

u_a^I = − Σ_n g_a(||Y_In||) · Y_In / ||Y_In||   (23)

where Y_In = Y_I − Y_n is the displacement between the I-th UAV and neighboring UAV n, and g_a(||Y_In||) is the attraction-force function between the I-th UAV and its n-th neighbor. Here, we consider a constant function for the attraction force between these UAVs.
Separation: Every UAV generates a repulsion command so that it does not collide with its nearest neighbors. This repulsion force is generated through the control input producing an acceleration that moves the UAV away from the neighboring UAVs:

u_r^I = Σ_n g_r(||Y_In||) · Y_In / ||Y_In||   (24)

where g_r(||Y_In||) is the repulsion-force function between the I-th UAV and its n-th neighbor.
Here, a combined force component for cohesion and separation can be defined as the sum of Equations (23) and (24). With a constant attraction g_a = a and a distance-dependent repulsion g_r(||Y_In||) = b/||Y_In||^c, we write this combined force as

u_s^I = − Σ_n ( a − b/||Y_In||^c ) · Y_In / ||Y_In||

where a, b, and c are positive constants: a is the fixed attraction component between neighboring UAVs in the swarm, and b weights the separation force between the UAV and its neighbors; this repulsion increases as the distance between a UAV and its neighbor decreases, and vice versa.
Alignment: Every UAV tries to match its velocity with its neighbors while respecting the acceleration constraints imposed by the UAV structure and the control input limits. The alignment force component for the control input of a UAV can therefore be defined as

u_v^I = k_v ( (1/n_i) Σ_n V_n − V_I )

where k_v is the controller gain used to control the velocity of any UAV in the swarm and n_i is the total number of neighbors of the i-th UAV. Ultimately, the desired velocity, which matches the velocities of the neighboring UAVs, is achieved through the control system.
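A compact sketch of the three behavioral-rule components follows. The attraction-repulsion profile used here (constant attraction a, repulsion b/‖y‖) is one simple choice consistent with the description in the text, not necessarily the paper's exact function; the default constants are arbitrary.

```python
import numpy as np

def social_force(my_pos, my_vel, nbr_pos, nbr_vel, a=1.0, b=20.0, k_v=0.5):
    """Cohesion + separation + alignment for one UAV.
    Attraction is constant (a); repulsion b/d grows as neighbors
    get closer; alignment steers toward the neighbors' mean velocity."""
    u = np.zeros(2)
    for p in nbr_pos:
        y = my_pos - p                    # displacement Y_In
        d = np.linalg.norm(y)
        u += -(a - b / d) * (y / d)       # net attraction-repulsion
    u += k_v * (np.mean(nbr_vel, axis=0) - my_vel)   # alignment
    return u

# One neighbor 10 m away moving at 5 m/s: repulsion (b/d = 2 > a = 1)
# pushes away, alignment pulls the velocity toward the neighbor's.
u = social_force(np.array([0.0, 0.0]), np.array([0.0, 0.0]),
                 [np.array([10.0, 0.0])], [np.array([5.0, 0.0])])
```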

Mission-Based Components of the Control Input
Offensive approach: When the enemy HPT is selected as the target (mode 1), the UAV uses a PN guidance approach to move towards it. Since the mission objective is to attack the enemy HPT, this is the offensive approach, and the desired navigation force that drives the UAV towards the target can be defined as

u_T^I = k_T (Y_B − Y_I)

where Y_B is the position of the enemy HPT at the current time instant and k_T is the proportional gain that drives the UAV towards the target.
Defensive approach: When an enemy UAV is selected as the target (mode 0), PN guidance is likewise used to move towards that UAV. This is the defensive approach, since the UAV tries to stop the enemy UAV from attacking the home-side HPT. The desired navigation force for the I-th UAV to attack the J-th enemy UAV can be defined as

u_D^I = k_u (Y_J − Y_I)

where k_u is the controller gain used to drive the UAV towards the target UAV.
The total control input for a UAV depends on its mode of operation:

u_I = u_s + u_v + u_T in mode 1,  u_I = u_r + u_D in mode 0

where u_s is the combined cohesion-separation force, u_v the alignment force, and u_r the separation force alone. In mode 1, the swarm UAVs attack the enemy HPT, and every UAV generates commands from all the social forces defined for the swarm. In mode 0, the UAV attacks an enemy UAV (defensive behavior) and does not consider the attraction force with the neighboring UAVs; only the separation component is retained to avoid collisions among the neighboring UAVs.
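The mode-dependent assembly of the total control input can be sketched as follows. The pursuit term is a simplified proportional stand-in for the PN guidance law, and the force arguments are assumed to be computed by the behavioral-rule components described above; all names and gains are illustrative.

```python
import numpy as np

def mission_force(my_pos, target_pos, k=0.05):
    """Simplified pursuit term driving the UAV toward its target
    (enemy HPT in mode 1, enemy UAV in mode 0); a proportional
    stand-in for the PN guidance law in the text."""
    return k * (target_pos - my_pos)

def total_control(mode, u_cohesion, u_separation, u_align, u_target):
    """Mode 1 (offense): all social forces plus the target term.
    Mode 0 (defense): drop cohesion and alignment, keep separation.
    Mode -1 (destroyed): no command."""
    if mode == 1:
        return u_cohesion + u_separation + u_align + u_target
    if mode == 0:
        return u_separation + u_target
    return np.zeros_like(u_target)

# Defensive UAV (mode 0): only separation and pursuit contribute.
u = total_control(0,
                  np.array([1.0, 0.0]),    # cohesion (ignored in mode 0)
                  np.array([0.0, 1.0]),    # separation
                  np.array([0.5, 0.0]),    # alignment (ignored in mode 0)
                  np.array([2.0, 2.0]))    # pursuit term
```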

Simulation and Discussion
This section discusses the simulation and results that verify the effectiveness of the proposed ODDDM algorithm for the UAV swarm vs. swarm combat problem under communication constraints. First, the environment is developed for a particular test case: two multi-agent systems are structured for the swarm vs. swarm combat situation, with the HPTs as moving aircraft carriers. The effect of parameter variations in the proposed ODDDM algorithm on the outcome of the combat is then analyzed.
Environmental Conditions and Simulation Parameters: The combat area covers a range of 20 × 10 km² with no threat or no-fly zones; the field stretches 20 km towards the east and 10 km towards the north. The UAVs start in launched condition at random positions with respect to their respective aircraft carriers in the 2D environment, i.e., around (length, width) = (1, 5) km at t = 0 s. One side is marked red and the other blue, and both sides begin their attack simultaneously in the simulation.
Combat Forces Parameters: Initially, the red aircraft carrier is positioned at (0, −10) km and the blue one at (0, 10) km. The value of each HPT is set to 10 and the value of each UAV to 1, with the survival probability threshold set to 0.2 for both UAVs and HPTs. Every UAV carries five weapons and can fire at either air or surface targets, depending on the decision of the proposed ODDDM algorithm. The minimum time between two consecutive fires is 1 s. The kill probability of a weapon is 0.8 against a UAV and 0.2 against an HPT, and the environmental effect factor is 0.95.
CBAA Parameters: For the combat situation assessment, the parameters are set as ω_1 = 0.3, ω_2 = 0.6, and ω_3 = 0.1; the assessment parameters against the HPT are set as ω_T1 = 0.4 and ω_T2 = 0.6. The weight coefficient for the profit calculation is α = 0.5. The combat superiority threshold of a UAV against each enemy UAV or base is set to 0.9. For the swarm motion decision-making algorithm, the maximum number of neighbors is set to N_Il = 7, the communication interval with neighbors is 1 s, and the command update interval of the UAV is 100 ms.
Brief Discussion: The dynamic combat between the two sides starts as soon as they obtain information about the enemy HPT. There are 20 UAVs on each side, launched from their respective aircraft carriers; their initial positions at the start of the combat are randomly generated near those carriers. The initial speed is fixed at 80 m/s for all UAVs, with random motion directions. The offense/defense preference for both sides is set to f_b = 0.1. The trajectories of all the UAVs and HPTs during combat are shown in Figure 4. In this particular simulation case, the Blue side eliminates the enemy UAVs and the enemy HPT (the Red aircraft carrier), with both aircraft carriers (Red and Blue) moving north at a speed of 15 m/s. Note that the destroyed Red HPT is 5 km away from its initial position, reflecting the movement of the aircraft carrier.
In addition, Figure 4 shows that the UAVs move in a smooth and orderly manner towards the enemy base under the influence of the social forces (cohesion, separation and alignment) before they detect any enemy UAVs within their detection range. In this phase, the mode of operation for all the UAVs is 1.
As soon as a UAV detects an enemy UAV within its range, it discards the cohesion force with its neighbors and starts to chase the detected enemy. However, it retains the repulsion/separation force with its neighbors to avoid collisions within the swarm. The UAVs engage in a dogfight, and eventually most of the Red UAVs are destroyed. A higher value of f_b indicates that a UAV attacks the enemy base, while a lower value means that the enemy UAV is attacked first to defend the home-side aircraft carrier. While defending against an enemy UAV, the operational mode of the UAV is 0. The active UAVs with fighting capabilities throughout the combat duration for both sides are shown in Figure 5. At t = 215 s, the Blue side has six active UAVs and all the Red UAVs are destroyed. The remaining Blue UAVs then move toward the enemy aircraft carrier to attack.
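The mode-dependent force composition described above can be sketched as follows. The gains and the 1/r repulsion law are illustrative assumptions (the paper's force equations and coefficients are not reproduced here); the key point is that detection drops cohesion and alignment while keeping separation.

```python
import math

# Social-force sketch matching the described behavior: once an enemy UAV
# is detected (mode 0), cohesion and alignment are dropped, but the
# separation/repulsion term is always kept to avoid intra-swarm
# collisions. All gains are illustrative assumptions.

C_COH, C_SEP, C_ALI = 0.5, 1.0, 0.3  # assumed force gains

def swarm_force(pos, vel, nbr_pos, nbr_vel, enemy_detected):
    """Return the 2D social force on one UAV given its neighbors."""
    fx = fy = 0.0
    for (px, py) in nbr_pos:
        dx, dy = pos[0] - px, pos[1] - py
        r = math.hypot(dx, dy)
        if r > 1e-6:                    # repulsion magnitude grows as 1/r
            fx += C_SEP * dx / r**2
            fy += C_SEP * dy / r**2
    if not enemy_detected:              # mode 1: keep formation
        n = len(nbr_pos)
        cx = sum(p[0] for p in nbr_pos) / n
        cy = sum(p[1] for p in nbr_pos) / n
        fx += C_COH * (cx - pos[0])     # cohesion toward neighbor centroid
        fy += C_COH * (cy - pos[1])
        fx += C_ALI * (sum(v[0] for v in nbr_vel) / n - vel[0])  # alignment
        fy += C_ALI * (sum(v[1] for v in nbr_vel) / n - vel[1])
    return fx, fy
```

With a neighbor directly ahead, the force is purely repulsive in mode 0, while in mode 1 the cohesion term partially cancels the repulsion, holding the formation together.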
The remaining Blue UAVs attack the Red aircraft carrier at its initial position in mode 1. When the UAVs do not find the enemy aircraft carrier at its initial location, they start to search for the real-time coordinates of the aircraft carrier. The UAV mode changes to 2 after locating the Red aircraft carrier. The survival probabilities of both aircraft carriers during combat are presented in Figure 6.
We present a particular case of UAV-to-UAV combat to analyze a UAV's survival condition. Here, the 10th Blue UAV, which destroys the 13th Red UAV at t = 123 s, is considered. The 10th Blue UAV locks onto and fires at the 13th Red UAV at t = 122 s, reducing the Red UAV's survival probability to approximately 0.3. Since the weapon-firing rate is limited to at most 1 weapon/s, the Blue UAV fires a weapon again at t = 123 s to destroy the Red UAV, whose survival probability then drops below the threshold. Figure 7a presents the firing of the 10th Blue UAV and the destruction of the 13th Red UAV, and the 10th Blue UAV's operation modes during the dogfight are shown in Figure 7b. The Blue UAV then looks for another target according to the attack profit against the other Red UAVs or the Red aircraft carrier. As long as the UAVs are not within the detection range of enemy UAVs, they rely on the heuristic data available for the enemy aircraft carrier location and remain in the attack mode (mode = 1); see Figure 7b. As soon as an enemy UAV is detected, the operation mode changes to defense mode (mode = 0), based on the profit against it, through the auction process. After the destruction of the 13th Red UAV, the 10th Blue UAV remains in the defense mode, and the 3rd Red UAV is selected as the next target.
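The rate-limited engagement in this example can be sketched as a simple loop: fire at most once per second and stop once the target's survival probability drops below the threshold. The multiplicative damage model is the same illustrative assumption used earlier in this section, not the paper's exact rule.

```python
# Rate-limited engagement sketch for the 10th-Blue vs. 13th-Red example.
# The damage model is an illustrative assumption; the 1 s fire interval
# and the 0.2 threshold are the stated simulation parameters.

P_KILL, ENV_FACTOR, P_SURVIVE_MIN = 0.8, 0.95, 0.2
MIN_FIRE_INTERVAL = 1.0  # minimum seconds between consecutive shots

def engage(t_lock: float, p_survive: float = 1.0):
    """Fire at the locked target until destroyed.

    Returns (next_fire_ready_time, shots_fired); t advances by the
    minimum fire interval after each shot.
    """
    t, shots = t_lock, 0
    while p_survive >= P_SURVIVE_MIN:
        p_survive *= 1.0 - P_KILL * ENV_FACTOR
        shots += 1
        t += MIN_FIRE_INTERVAL
    return t, shots

# Locking at t = 122 s, the target falls below the threshold on the
# second shot, i.e., at t = 123 s, as in the simulated case.
```

In this model the first shot leaves the target above the threshold and the second destroys it, matching the two-shot sequence reported in Figure 7a.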
The speeds of all the UAVs on both sides are presented in Figure 8. As soon as a UAV is destroyed, its speed diminishes. The speeds of all the UAVs involved in combat remain within the limits listed in Table 1.
Repeated simulations have been carried out to gather statistical data for the combat. Since the combat scenario and the environment are identical for both sides, the expected winning ratio is 50:50. Over the randomized test cases, the Red side won 495 times while the Blue side won 505 times. The results match this expectation under the considered practical constraints, as shown in Figure 9.
Swarm Behavioral Analysis: Based on the available literature, the swarm width, i.e., the coverage of the UAV swarm while attacking the enemy, plays a critical role in combat. We analyze the effect of the cohesion and separation forces within the swarm, which control the swarm width during the attack, by changing the value of the repulsion-force component. Two cases are used: (1) the value of c in the cohesion/separation equation for the Blue UAVs is doubled; (2) the value of c for the Blue UAVs is halved.
It is observed that, in case 1, the average separation distance between the UAVs of the Blue swarm decreased. However, the overall effect of case 1 compared to case 2 is not significantly high: in case 1, the ratio of Blue-side wins to Red-side wins is approximately 1.15:1, while, in case 2, the ratio is approximately 1:4. The results are presented in Figure 10.
The change in parameter f_b determines the attack/defense option against the enemy. A higher f_b means that a UAV preferentially attacks the enemy aircraft carrier, while a smaller f_b value means that the UAV defends against the enemy UAVs first. The effect of changing f_b in the Red and Blue UAV swarms is observed, and the success probability of the Blue swarm is shown in Figure 11.
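The role of f_b in the mode decision can be sketched as a weighted comparison between the profit of attacking the carrier and the profit of engaging a detected enemy UAV. The comparison rule below is an illustrative assumption; only the meaning of f_b (higher favors offense) comes from the text.

```python
# Offense/defense choice sketch driven by the preference f_b. The
# weighted comparison below is an illustrative assumption, not the
# paper's exact decision rule.

def choose_mode(f_b: float, profit_carrier: float, profit_uav: float) -> int:
    """Return 1 (attack the enemy HPT) or 0 (defend against enemy UAVs)."""
    if f_b * profit_carrier >= (1.0 - f_b) * profit_uav:
        return 1   # offense: head for the enemy aircraft carrier
    return 0       # defense: engage the detected enemy UAV

# With f_b = 0.1 (as in the baseline simulations), defense dominates
# unless the carrier profit greatly exceeds the best UAV profit.
```

This makes the trade-off in Figure 11 explicit: sweeping f_b shifts how often UAVs pick mode 1 over mode 0, which in turn changes the swarm's success probability.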
The success probability of a UAV swarm also varies with the profit-calculation variable α, as discussed in earlier sections. The results of varying α in both the Red and Blue swarms are shown in Figure 12.

Conclusions
This research paper proposes an offense-defense decision-making algorithm for a UAV swarm vs. swarm combat scenario in the presence of limited computation, communication, and attack resources for the UAVs. The scenario is built on heuristic data of the aircraft carrier location, which is not updated unless the carrier is no longer found at its initial position, since the target can move. Specifically, a consensus-based model is proposed for target allocation, which seeks a conflict-free solution among neighbors to obtain the best possible results. The key feature is that each individual agent in the swarm decides on its own guidance parameters while maintaining the swarm characteristics. The decision evaluation in each UAV improves the operational flexibility of the swarm. Proportional navigation (PN) guidance is used to steer each agent towards its target. In addition, UAV limitations that can be observed in real time were also considered for practical real-time implementations. The internal algorithm parameters used to decide on the profit against the enemy UAVs play a critical role in the final success or failure of the mission. The win probability is at its maximum when the total profit is calculated based on the superiority of a UAV compared to its identified enemies. However, the chances of winning decrease if the UAVs try to attack the enemy base directly, disregarding the defense option. Hence, these pre-combat simulations give the user an opportunity to tune the internal tactics of the swarm, i.e., the target allocation method and the swarm behavioral model.