Adversarial scenarios for herding UAVs and counter-swarm techniques

Abstract The present paper aims to design and simulate an adversarial strategy where a swarm of quadrotor UAVs herds anti-aircraft land vehicles (AALVs) that actively oppose the swarm’s objective by potentially taking its units down. The main strategy is to block the AALVs’ line of sight to their goal zone (AALVs’ objective), shifting their trajectory so they reach a kill zone instead (UAVs’ objective). The counter-swarm strategy performed by the AALVs consists of taking down the aerial units closest to the goal zone. As a result, a consensus algorithm is executed by the UAVs in order to assess the communication network and re-group. Consensus is based on the propagation of local observations that converge into a global agreement on a communication graph. Re-grouping is done either by positioning around the kill zone vector or by preferring an anti-clockwise formation that better closes gaps. The adversarial strategy was tested in an empty arena and in an urban setting, the latter making use of a path-planning procedure that re-routes the AALV trajectory based on its current destination. Simulation results show a maximum UAV mission success rate converging to roughly 80% in the empty arena. When targeted elimination procedures are executed, UAV mission performance drops 5%, with no distinction between re-grouping strategies in the empty arena. The urban setting shows lower performance due to navigation complexity but favors the decision to re-group based on a formation that closes gaps rather than positioning around the kill zone vector.


Introduction
Shepherding behavior refers to external agents (shepherds) influencing the movement of another group of agents (flock) [1]. The forces involved in such influence are repulsive ones, so the flock can move away from the shepherds. By controlling the formation of shepherds, so it "encloses" the flock, the repulsive forces at play can ultimately redirect the motion of the flock toward some designated goal. One form of shepherding is herding, which consists of shepherds steering a flock toward a designated region.
In the realm of multi-agent systems, implementing a herding behavior was inspired by the motion of fish, birds and other animals that live in groups. Subsequent advancements in kinematic modeling allowed artificial swarms to conduct herding tasks such as crowd control, which is another definition for shepherds steering a flock toward a goal. In ref. [2], the author implemented a V-formation control algorithm that can be flexible in terms of the number of shepherds required for successfully cornering the flock, due to the increased effectiveness discussed in ref. [1]. It was later found that shepherd movement based only on the agents' center of mass is effective [3], although cooperation and/or synchronization was not considered. Other non-traditional herding methods involve the abstraction of the flock's geometry as a deformable blob [4], which is useful when the flock has a complex motion or grows considerably in size. In general, shepherding methods and their outcomes thrive in simulation environments but real applications are intended for swarm robotic systems [5]. Some other examples found in the literature for shepherding tasks are navigation with unmanned vehicles through challenging terrain, estimation of area scope, guiding birds away from airports, etc. [5].
More complex implementations of artificial shepherding can be found in ref. [6], where reactive models lead the control methodology. Agents respond to stimuli originating from the environment (e.g., speed between herd and flock, formation of the flock, herd size and obstacle density). Ref. [7] considered a more realistic sensing procedure for the flock, where kinematic forces are governed by the ability to estimate a neighborhood density metric. This improves control stability, and it is more appropriate for robotic implementations. Ref. [6] also reports a phase transition, discussed as an environmental state that depends on obstacle density and flock size. It serves as a boundary for a complexity increase in shepherding and is useful for analyzing the limitations of swarm controllers. In ref. [8], the added complexity to shepherding behavior consists of the herd collecting rogue flock units back to the formation, as well as leading all of them to the herd's goal zone. Adding noise to the sensed information (actuators only) was done to test a reinforcement learning (RL) procedure. It is thoroughly discussed in refs. [9,10,11] that AI is needed for cognitive approaches to design shepherding swarm controllers. An ontology for shepherding is introduced in ref. [10], based on a single architecture with two pillars: decision-making autonomy and contextual awareness for all agents. The focus is on relevant sensing and efficient communication, while the controllers rely on distributed AI algorithms.
A potential military application of herding is studied in this paper, where shepherds are quadrotor UAVs (Unmanned Aerial Vehicles) driving anti-aircraft land vehicles (AALVs) toward a kill zone. That is the main objective for the UAVs; however, the AALVs aim to complete a patrol in a designated area by reaching a goal zone, which opposes the objective of the UAVs. Such an adversarial scenario creates opportunities for defense strategies adopted by the AALVs, such as counter-swarming via target elimination. The initial UAV response is considered to be the assessment of the formation in terms of connectivity. Due to the decentralized nature of UAV controllers, the coordination of motion requires agents to assess their vicinity in order to properly enclose the AALV units on the ground. If one or multiple aerial units are compromised, the swarm must be able to evaluate its structure and reach a decentralized consensus on the current state of the communication topology. Once the consensus is reached, formation changes can be performed so the mission can still be achieved.
This paper focuses on the implementation of an adversarial model for aerial swarms versus land vehicles under different environmental scenarios. The list of contributions is as follows:
• The proposal of a 2-D kinematic model for a swarm of UAVs herding a group of land vehicles with counter-swarm capabilities.
• The proposal of a consensus algorithm that triggers when land units take down elements of the UAV swarm. The ripple effect on the swarm state can help the group to perform formation changes quickly, benefiting the herding goal.
• The study of targeted elimination maneuvers by the land units, considering gaps in the UAV formation in order to calculate which UAV to take down, maximizing the probability of mission failure for the swarm.
• The extension of the 2-D herding model to consider path-planning features when the environmental scenario is an urban setting.
• The experimental evaluation via simulations, in order to assess the best strategies for both the swarm and land units based on mission success rate.
The rest of the paper is organized as follows. Section 2 comprises a literature review of the different concepts explored in this paper's contributions. Section 3 introduces the 2-D formulation for the dynamics of every agent. Section 4 considers the outcome of a targeted elimination procedure and the way UAVs can assess the state of the network afterward. Section 5 proposes algorithms that implement counter-swarm strategies in order to take down aerial units, hoping that their reduced "vision" can be translated into reaching the goal zone more often. In addition, re-grouping techniques are discussed as a response from the UAVs. Section 6 explains the dynamics of a basic kinematic controller for both UAVs and AALVs, when they both execute their adversarial behavior. Section 7 indicates the simulation setup for the proposed adversarial mechanism with the numerical results. Conclusions are given in Section 8.

Related work
A general assumption regarding communication capabilities of UAVs is a two-way communication link between agents, only established if they are within a circular communication range [12] (also called situated communication [13]). It can also be assumed that the communication channel is reliable; however, simulating packet losses can be beneficial for measuring the convergence speed of the consensus protocol. When modeling a communication topology under a situated communication scenario, a graph can be used to describe wireless links between agents. The entire topology status can then be measured in terms of total connectivity, that is, whether or not any two agents can communicate with each other in the network. When it comes to consensus-achieving algorithms, there are many approaches found in the literature, all more or less related to multi-agent scenarios that require a homogeneous assessment of a swarm network.
In ref. [14], a minimum spanning tree was derived from a dynamic and decentralized setting in order to detect damaged sections in aerial vehicles' skins (via detection of connectivity disruptions). Similarly, network nodes can be aware of the overall size in terms of node count when the topology is not known a priori and there is no hierarchy in the node structure [15]. Perhaps the biggest advantage of ref. [15] is the fact that the consensus only relies on active communication with neighbors, a notion taken into account for the proposed consensus procedure. It was shown in ref. [16] that the consensus speed strictly increases with the number of neighbors per node, an effect that is even more pronounced in the presence of environmental noise. Other methods involve the use of a consensus matrix, which is based on the eigenstructure of the network's topology graph [17].
To assess connectivity this way, the Fiedler value can be used, as seen in refs. [13,18,19]. It refers to the second smallest eigenvalue of the Laplacian matrix L = D − A, where D is the degree matrix of the graph and A is its adjacency matrix. When the Fiedler value is greater than zero, the graph is connected. The Fiedler value is used in refs. [13,19] to optimize the robot deployment process, making sure that the motion controller keeps connectivity at all times. Estimation of the Fiedler value determines how the network topology can be reconstructed from local observations. Further applications of connectivity assessment can be found in ref. [13], where a deployment algorithm is developed for robot agents so they can reach tasks at arbitrarily distant locations while satisfying graph connectivity constraints.
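As an illustration of this connectivity test, the following minimal Python sketch computes the Fiedler value of a small undirected graph with NumPy. The function name `fiedler_value` and the two example adjacency matrices are assumptions introduced here for illustration; they are not taken from the cited works.

```python
import numpy as np

def fiedler_value(adjacency):
    """Second-smallest eigenvalue of the graph Laplacian L = D - A."""
    A = np.asarray(adjacency, dtype=float)
    D = np.diag(A.sum(axis=1))           # degree matrix
    L = D - A                            # graph Laplacian
    eigenvalues = np.linalg.eigvalsh(L)  # ascending order for symmetric L
    return eigenvalues[1]

# Hypothetical examples: a three-node chain, and the same nodes with one link lost.
connected = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
split = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]

print(fiedler_value(connected))  # strictly positive: the graph is connected
print(fiedler_value(split))      # approximately zero: the graph is disconnected
```

A centralized computation like this one needs the full adjacency matrix, which is why decentralized estimates are of interest when each agent only observes its own neighbors.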
However, in ref. [20], this eigenvalue of the Laplacian is estimated in a decentralized and online manner, for the case when the network grows unpredictably. It uses a technique called discrete-time power iteration to reconstruct eigenvalues and their associated eigenvectors, just by sharing local variables among neighboring agents. In order to support the kinematic formulation of agents, the technique is modified to support continuous functions. Nonetheless, the main objective in ref. [20] is to derive only the algebraic connectivity metric (or Fiedler value) from local observations and information sharing, not the actual graph topology. In this paper, the motivation behind estimating a communication network is to support a decision-making process that responds to target-elimination strategies implemented by AALVs on the ground. The Fiedler value can potentially validate the inferred topology, but exchanging information regarding communication links is more valuable for a defense mechanism that aims at specifically controlling the motion of UAVs.
The design of an adversarial strategy that the UAVs can apply when the AALVs are actively trying to escape is considered to be a counter-strategy to the well-studied herding controllers discussed in refs. [1,4,5], where the objective is to guide a group of agents toward a certain area. Some of these controllers are bio-inspired herding mechanisms (e.g., sheepdogs guiding a herd of sheep) [2,3] that can also be applied to swarming agents on both ends, as seen in ref. [21]. In such work, the authors design specific feedback laws to support multiple agents driving a herd of an arbitrary number of members.
In terms of these herd dynamics, it is important that motion and formation are robust enough to deal with local flock movement and potential external disruptions. For example, in ref. [22], local laws are used to maintain formation cohesiveness and connectivity when information is limited and node failures are unpredictable. Enhancing cohesion of a swarm by representing it via graph metrics was discussed in ref. [23]. Particle swarm optimization was used to optimize the distance between the flock and the herd's goal zone, subject to cohesion constraints. It was found that neighborhood awareness is crucial when dealing with limited sensing range. In ref. [24], a flight planning procedure is based on learning about radio propagation characteristics in the environment, with coordination that enables passing messages with routing costs. This achieves robust communication links in the topology while also meeting swarm objectives.
Similarly, a communication backbone can be developed in real time by connecting multiple distant target locations in a decentralized manner [25]. Connectivity is preserved at all times in this proposal, by growing a logical tree based on real physical network links. This is often a distributed procedure since applications require discovery of new routes or nodes in order to complete the mission (e.g., when applied to post-disaster transportation applications as in ref. [25], or searching for a target in an unknown environment [26]). A typical follower-leader relationship can be introduced in order to derive chains of robots that also resemble a topological spanning tree [27]. Connectivity can also be maintained by having a swarm follow biological behaviors such as cohesion, separation and alignment, which were slightly modified so they can fit a general force-based kinematic model [28].
The main contribution of ref. [21] is to use potential fields to model the herd dynamics, which is a feature used in this paper's 2-D formulation proposal. Similar work is seen in refs. [12,29] where a UAV defense system is used to intercept and capture an enemy swarm via the use of custom formations. The end goal of this approach is to escort the malicious group of UAVs outside a forbidden zone. Previous methods that rely on maintaining a graph connectivity are used in order to prevent losses in communication, due to the fact that the formation procedure is quite granular, considering phases such as deployment, clustering, formation, interception and escort.
In addition, this work focuses not only on the herding mechanism to guide AALVs toward a kill zone, but also takes into account an adversarial situation where the mobile unit has an independent objective it is constantly trying to achieve (reaching the goal zone). The defense mechanism is also dynamic, meaning that the AALV group is able to perform a targeted elimination strategy against the UAVs, reducing their influence range and therefore increasing the chances of escaping, ultimately causing the swarm's mission to fail. Other adversarial methods found in the literature are scarce, but they consider swarm-versus-swarm scenarios where UAVs are incapacitated by using numerous strategies such as individual elimination of a target via ad hoc methods, and breaking up a swarm into clusters in order to weaken it [30].
Nonetheless, a basic herding controller is considered, based on ref. [21] and extended to consider the adversarial situation previously described. A simulation setup is implemented, which considers the traversal of an urban setting, with the establishment of path-planning protocols (inspired by refs. [31] and [32]) on top of the proposed adversarial strategies. Path planning can also be considered for herd controllers, directly affecting the implementation of shepherding behavior. In ref. [33], evolutionary-based path planning is used by the herd to encounter the flock and guide it toward the objective. The novelty of this proposal is the application of waypoints, always optimizing sub-goals at every path step while avoiding obstacles. Multiple AI approaches to path planning and motion control can be found in ref. [34]. This survey investigates multiple competences for robot navigation where RL is applied. When it comes to path planning, the overall framework of RL is considered where reactive behaviors implement already established algorithms like Dijkstra or A*. One example for motion control is to have an action space with a back-propagation network so the model converges to the stated objectives in terms of traversal with obstacle avoidance.

2-D formulation
A 2-D formulation is considered in order to simplify the kinematics of the herding swarm. Since the adversarial method considers UAVs versus land units, there should be a fixed distance between them for the targeted elimination procedure to be possible. At this stage, evasion tactics via altitude changes are not considered as a UAV counter-strategy (as a response to targeted elimination procedures), due to their trivial nature. Instead, network assessment strategies and re-formation outcomes are proposed, hence the need for simplifying the behavior model and removing the Z axis from the formulation.
Based on Fig. 1, consider m UAVs with positions d_j ∈ ℝ², where j ∈ {1, ..., m}, and AALVs with positions s ∈ ℝ². UAVs are controlled such that they drive the AALVs to a desired kill zone, located at z with angle ψ from position s. By taking the case of just one AALV located at s (or the group's center of mass), its trajectory toward the kill zone at speed v can be defined as in Eq. (1). This works under the assumption that the AALVs will move toward the kill zone in a straight line. However, depending on the level of autonomy, the vehicle might plan different escape routes and follow different paths.
Nonetheless, since the UAVs are following the AALV group closely, their positions are conditioned on s(t), with additional directional parameters that attempt to influence the orientation of the AALVs so that it changes toward the kill zone. The positions for every UAV in the formation are calculated with Eq. (2), where r is a fixed distance between the AALVs and any UAV, α_j is the angular orientation with respect to ψ and Δ_j is the angular separation between UAVs, both defined in Eq. (3). Notice the parameter β included in Eq. (3), which is useful to increase or decrease the separation between UAVs in the swarm formation.
In addition, UAV motion contains a noise component in order to simulate wobbling and make the simulation more realistic. It is also a feature that allows modulation of transmission reliability when the communication range barely covers an agent. The noise component is 2-D Perlin noise, given in Eq. (4), where δ is a noise modulation factor that controls the deviation from position d_i.
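To make the geometry of the formation concrete, the following Python sketch places m UAVs at distance r from the AALV center s. It is only a sketch under stated assumptions: the angular layout in `formation_angles` (an arc of width β·π, evenly spaced and centered on the direction opposite to ψ) is a hypothetical stand-in and not the formation equation of this paper, and all function names are introduced here for illustration. Motion noise is omitted.

```python
import math

def kill_zone_bearing(s, z):
    """Angle psi of the vector from the AALV center s to the kill zone z."""
    return math.atan2(z[1] - s[1], z[0] - s[0])

def formation_angles(psi, m, beta=1.0, spread=math.pi):
    """ASSUMED layout: m angles spaced evenly over an arc of width beta*spread,
    centered on the direction opposite to psi (behind the AALVs)."""
    if m == 1:
        return [psi + math.pi]
    step = beta * spread / (m - 1)
    return [psi + math.pi + step * (j - (m - 1) / 2) for j in range(m)]

def uav_positions(s, r, alphas):
    """Place each UAV at fixed distance r from s along its angle alpha_j."""
    return [(s[0] + r * math.cos(a), s[1] + r * math.sin(a)) for a in alphas]

# Hypothetical usage: three UAVs herding from behind, kill zone to the right.
s, z = (0.0, 0.0), (1.0, 0.0)
alphas = formation_angles(kill_zone_bearing(s, z), m=3)
for point in uav_positions(s, 2.0, alphas):
    print(point)
```

Increasing `beta` in this sketch widens the arc and therefore the gap between neighboring units, which mirrors the role the text assigns to β.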

Consensus algorithm
Considering a formation of m UAVs, in order to assess connectivity and reconstruct the topology graph, all units detect neighbors by making use of the communication range with radius C. The condition to determine a communication link between two UAV units d_i and d_j is given by Eq. (5): d_j must lie inside the communication range of d_i.
Figure 3 shows the algorithm to ping other UAVs in order to retrieve communication links. This procedure is decentralized, meaning it can be executed locally, where the list of current neighbors N is different for every UAV unit. For simulation purposes, the whole set of UAV positions D is available in order to determine neighbors. A realistic ping procedure would retrieve the same results as the geometric alternative shown here. The procedure makes use of Eq. (5) to determine which UAVs are inside the communication range of local unit d_i. A list of neighbors N is then stored locally.
Figure 4 shows the broadcast procedure, executed after pinging existing neighbors. This is also a decentralized procedure, so every UAV can transmit the discovered link set L to all neighbors in N_i. In order to progressively move toward a consensus, each local copy of the inferred graph topology (G_i in Fig. 4) is sent at every broadcast call. The receive method is then called in order to process the graph estimation. Notice that this series of calls happens in a synchronous way, simplifying the actual process of two-way communication in wireless networks. This decentralized synchronous consensus algorithm is developed so that correctness is evaluated rather than actual performance.
Figure 5 shows the procedure for individual UAVs to receive existing versions of topology graphs from surrounding neighbors. It simply reconstructs local versions G_i by examining already detected links in the received payload L, and adding the ones that are new. The basic idea is that after a series of exchanges, the network would be able to agree on a global topology that represents the actual communication graph (ground truth).
Nonetheless, this is also dependent on the communication range C, movement noise and, most importantly, the existence of gaps due to target-elimination strategies executed by AALVs on the ground. The next section takes a look at how this algorithm performs under different scenarios that vary these variables.
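The ping, broadcast and receive routines described in this section can be sketched in Python as follows. This is a minimal synchronous illustration, not the pseudo-code of Figs. 3 to 5: the function names, the dictionary of positions, the representation of links as unordered pairs and the stopping rule (stop once a round changes nothing) are all assumptions made here.

```python
import math

def ping(i, positions, C):
    """Neighbors of UAV i: every other unit within communication radius C."""
    return [j for j in positions
            if j != i and math.dist(positions[i], positions[j]) <= C]

def consensus_rounds(positions, C, max_rounds=100):
    """Synchronous ping/broadcast/receive until no local graph changes.
    positions: dict of UAV id -> (x, y). Returns (local_graphs, rounds)."""
    neighbors = {i: ping(i, positions, C) for i in positions}
    # Each UAV starts knowing only its own links, stored as unordered pairs.
    graphs = {i: {frozenset((i, j)) for j in neighbors[i]} for i in positions}
    for rounds in range(1, max_rounds + 1):
        # Broadcast: snapshot every local graph so the round is synchronous.
        sent = {i: set(g) for i, g in graphs.items()}
        changed = False
        for i in positions:
            for j in neighbors[i]:
                new_links = sent[j] - graphs[i]  # receive: keep unseen links only
                if new_links:
                    graphs[i] |= new_links
                    changed = True
        if not changed:
            return graphs, rounds
    return graphs, max_rounds

# Hypothetical usage: four UAVs in a row, each reaching only adjacent units.
positions = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (2.0, 0.0), 3: (3.0, 0.0)}
graphs, rounds = consensus_rounds(positions, C=1.1)
print(rounds, all(g == graphs[0] for g in graphs.values()))
```

In this sketch, a gap wider than C splits the formation into groups whose local graphs never merge, which is the failure mode that targeted elimination tries to provoke.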

Targeted elimination
Targeted elimination refers to the strategy conducted by AALVs that aims at maximizing the opportunity of reaching the goal location, as opposed to the UAVs' mission, which consists of pushing the flock toward the kill zone. This is an addition to the basic competitive scenario that both groups follow, where the two missions are executed simultaneously. To further increase the odds of reaching the goal location, AALVs activate their defense mechanisms to take down aerial units so they stop blocking the path toward the goal. While the UAV formation re-groups, the newly formed gap can be an advantage to AALVs, perhaps reaching the goal before the aerial units can apply a countermeasure. Figure 6 shows the targeted elimination procedure in terms of the geometry of the agents. Figure 6(a) identifies the k aerial units closest to the goal zone as targets. The number of targets k is arbitrary and depends mostly on the real-world capabilities of AALVs. After the elimination of targets, in Fig. 6(b) UAVs attempt to fix the formation, resulting in a gap that the AALVs' center of mass s can follow toward the goal g.
To assess the capabilities of the proposed targeted elimination procedure, a measurement is performed on how the separation angle between aerial units changes when units are lost. From Eq. (3), the difference between two contiguous units is calculated as in Eq. (6).
Figure 6. Targeted elimination procedure executed by AALVs. (a) Closest aerial units to the goal are identified. (b) Successful elimination created a gap in favor of AALVs.
The procedure simply targets those aerial units that are closest to the goal location. In doing so, AALVs expect a gap opening that gives them direct access to the goal. However, the UAV formation might choose to cover the path to the goal when re-grouping, although that might be enough time for the AALVs to escape. Numerical results test this "gap period" and assess whether it translates to UAV mission failure.
In addition, the effect of the targeted elimination procedure on the flock coverage can also be calculated. Coverage is defined as the angular distance of the flock, from the first to the last unit, as given in Eq. (7), where β is the parameter used to increase coverage via larger angular separation between units. The higher the separation, the more coverage of the flock.
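The target selection step described in this section (take down the k aerial units closest to the goal) can be sketched in a few lines of Python. The function names and the example coordinates are hypothetical, and ties between equally distant units are broken by index order, which is an assumption of this sketch.

```python
import math

def select_targets(uav_positions, goal, k):
    """Indices of the k UAVs closest to the goal zone g."""
    order = sorted(range(len(uav_positions)),
                   key=lambda j: math.dist(uav_positions[j], goal))
    return order[:k]

def eliminate(uav_positions, targets):
    """Formation that remains once the targeted units are taken down."""
    removed = set(targets)
    return [p for j, p in enumerate(uav_positions) if j not in removed]

# Hypothetical usage: four UAVs around the origin, goal to the upper right.
uavs = [(0.0, 1.0), (1.0, 0.0), (0.0, -1.0), (-1.0, 0.0)]
targets = select_targets(uavs, goal=(3.0, 1.0), k=2)
survivors = eliminate(uavs, targets)
print(targets, survivors)
```

The surviving list is what the re-grouping step would then redistribute around the AALV center.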

Modified UAV formation procedure
As a response to the targeted elimination strategy performed by the AALVs, the UAV formation always re-groups following an anti-clockwise orientation. This is the default behavior as dictated by Eq. (3), where gaps always appear for the farthest UAV units in terms of the angle α_j. To make the comparison fairer, and to also test a different formation strategy, α_j can be calculated by splitting UAV units in equal parts from the kill zone orientation vector. Figure 7 shows the strategy for this new "fair" formation. By taking into account the angular positioning equations that conclude with Eq. (3), the coverage equation in (7) for β = 1 is redefined to give the total angular coverage of the UAV flock. In order to split such region into equal parts, the space is divided among the m UAV units, yielding the angular step. The algorithm to calculate UAV positions d_j is as follows:
1. Calculate the angular component of the last unit j = m (i.e., the one with the largest angle from ψ in a counter-clockwise orientation) according to Eq. (3).
2. Calculate the angular position for unit j, for j = 1, ..., m.
3. Calculate d_j according to Eq. (2).

Basic controller for flock and herd
Figure 1 shows the basic 2-D formulation for agents in the herding scenario. The adversarial strategy that guides the trajectories of both AALVs and UAVs was also briefly introduced. Figure 8 shows the pseudo-code for the adversarial herding procedure. This procedure is executed at every time step t, with positional inputs from a previous state (lines 2 and 3). The termination criteria consider reaching either the goal g or the kill zone z, via proximity to their respective radii (r_g and r_z).
First, the boundaries [θ_1, θ_m] for the UAV formation are calculated. These correspond to the minimum and maximum angles of d_j with respect to s (lines 5 and 6). Any point x is outside the UAV formation boundary if its angle with respect to s falls outside [θ_1, θ_m]. Then, if g is outside the UAV formation, the AALV center s moves toward g (lines 7 and 8); otherwise, s moves toward the kill zone z. After s moves to its new position, all UAVs are updated in order to maintain the herding formation and push s toward the kill zone z. To achieve this, each angle α_j is updated according to the new orientation of the vector z − s(t) (lines 12 and 13). Figure 9 shows a simulation example for the adversarial algorithm. It shows that the boundary conditions for the UAV formation indeed force the AALVs to move toward the kill zone when reaching the goal is not feasible. In the absence of obstacles, determining a clear line of sight to any destination is fairly simple, so a sudden change in direction does not undermine the possibility of reaching the goal or kill zone.
Figure 9. The trajectory of s for the entire simulation length is highlighted in blue. Initially, s is oriented toward the goal position g, and as the simulation progresses, the UAV formation herds s by shifting toward the right, aligning with the vector z − s at every time step. At point p, and due to the execution of the adversarial algorithm, g is blocked by the UAV formation, forcing s to move toward the kill zone z.
An issue could be the current orientation at the time of changing destination, especially if the distance to any final zone point is fairly short. In such a situation, the AALV center depends on its own velocity to determine whether a change in course is feasible or not. Nonetheless, this effect is measured in the experiments, since it directly impacts whether the UAV mission is successful.
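One time step of the AALV side of this controller can be sketched in Python as follows. This is a sketch under stated assumptions rather than the pseudo-code of Fig. 8: the blocking test is taken to be an angular interval check, bearings are measured relative to the direction opposite the kill zone so that the formation arc does not straddle the ±π discontinuity, and the names `wrap`, `goal_is_blocked` and `step` are introduced here. Updating the UAV formation and the termination criteria are left out.

```python
import math

def wrap(angle):
    """Wrap an angle to the interval (-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))

def goal_is_blocked(s, goal, uavs, z):
    """True when the bearing of the goal from s lies inside the angular
    interval [theta_1, theta_m] spanned by the UAV formation (assumed test)."""
    ref = math.atan2(z[1] - s[1], z[0] - s[0]) + math.pi
    rel = [wrap(math.atan2(d[1] - s[1], d[0] - s[0]) - ref) for d in uavs]
    theta_1, theta_m = min(rel), max(rel)
    goal_rel = wrap(math.atan2(goal[1] - s[1], goal[0] - s[0]) - ref)
    return theta_1 <= goal_rel <= theta_m

def step(s, goal, z, uavs, v, dt=1.0):
    """Move s toward the goal if it is visible, otherwise toward the kill zone."""
    target = z if goal_is_blocked(s, goal, uavs, z) else goal
    heading = math.atan2(target[1] - s[1], target[0] - s[0])
    return (s[0] + v * dt * math.cos(heading), s[1] + v * dt * math.sin(heading))

# Hypothetical usage: UAVs form a half circle behind s, kill zone to the right.
uavs = [(0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]
print(step((0.0, 0.0), (-5.0, 1.0), (10.0, 0.0), uavs, v=1.0))  # goal blocked
print(step((0.0, 0.0), (5.0, 5.0), (10.0, 0.0), uavs, v=1.0))   # goal visible
```

The interval test assumes the formation spans less than a full circle and sits roughly opposite the kill zone; a formation that wraps all the way around s would need a different representation.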

Adversary herding with path planning
When considering an urban setting for the experiments, the arena becomes a 2-D maze based on real map data. Streets and buildings are represented as paths and obstacles, respectively. The same adversarial herding procedures are applied to this setting with some modifications.
First, locations s (AALV center), g (goal zone) and z (kill zone) are placed randomly on valid paths. Depending on adversarial conditions, the two possible destinations (goal or kill zone) are reached via path planning with the A* algorithm. Figure 10 shows the pseudo-code for the adversarial herding procedure with A* path planning. At any time, the point x refers to the destination the AALV group is currently facing. Depending on the UAV formation, it could be either g or z. There is a chance that the algorithm constantly switches between these two positions, generating a gridlock condition which should be detected and avoided.
In the numerical results section, the likelihood of such an event for the urban setting experiments is measured. Once a destination has been determined, the A* algorithm is used to obtain a path toward it. If such trajectory is never disrupted by a change of direction (lines 12 and 13), the AALV center s follows each point p in the path P (lines 7 to 11). The UAV formation is also updated at the same time.
Path planning in the urban setting is used to challenge the land units, so that reaching the goal zone is harder for them. Since the UAVs are herding from above, they are considered to have a clear line of sight toward the AALVs, always re-grouping and changing their motion based on the position of the AALVs' center of mass s. A similar situation is considered for the targeted elimination procedure in the urban setting. The height of buildings and other structures that serve as obstacles for the AALVs does not interfere with the strategy that chooses which UAV to take down. As mentioned before, the urban setting is a limitation that only applies to land units trying to find a way toward the goal zone g, so it only affects the 2-D components of s and ignores any line of sight obstruction that could arise due to the maze design of the urban setting.
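For reference, the A* search used to route the AALV center toward its current destination can be sketched in Python on a small occupancy grid. The grid encoding (1 for a building, 0 for a street), the 4-connected moves with unit cost and the Manhattan heuristic are assumptions of this sketch; the experiments in this paper rely on real map data instead.

```python
import heapq

def a_star(grid, start, goal):
    """Shortest 4-connected path on a grid; grid[r][c] == 1 marks an obstacle.
    Returns a list of (row, col) cells from start to goal, or None."""
    def h(cell):  # Manhattan distance, admissible for unit-cost moves
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(h(start), 0, start)]
    came_from = {start: None}
    cost = {start: 0}
    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if cell == goal:
            path = []
            while cell is not None:  # walk back through the parents
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        if g > cost[cell]:
            continue  # stale heap entry, a cheaper route was already found
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (cell[0] + dr, cell[1] + dc)
            if not (0 <= nxt[0] < len(grid) and 0 <= nxt[1] < len(grid[0])):
                continue
            if grid[nxt[0]][nxt[1]] == 1:
                continue
            new_cost = g + 1
            if new_cost < cost.get(nxt, float("inf")):
                cost[nxt] = new_cost
                came_from[nxt] = cell
                heapq.heappush(open_heap, (new_cost + h(nxt), new_cost, nxt))
    return None

# Hypothetical usage: a wall forces a detour around the right-hand side.
grid = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
print(a_star(grid, (0, 0), (2, 0)))
```

Whenever the destination switches between g and z, the path would be recomputed from the current position of s with a fresh call.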

Results
To test the consensus algorithm, it is a requirement to understand the UAV formation in terms of the number of agents and the radius between the herd and the flock. Naturally, the larger the separation is between the center of the flock and the UAVs, the more space will be available between the aerial agents. This directly affects the communication range, especially if it is fixed for a particular number of UAVs and separation radius. In other words, attempts at modifying the formation due to elimination strategies launched from the ground will probably break the topology since gaps are now bigger. Such a scenario would probably require much higher tolerances for fixed communication ranges. Figure 11 shows the relationship between the distance between UAVs in the formation and the number of units when the flock separation radius is varied. This result is a direct measurement from the 2-D formulation in order to assess how the angular orientation α_j responds to the flock-herd radius r and the number of UAVs m. The overall observation is that targeted elimination strategies from the ground can cause the formation to lose agents or retreat, decreasing the number of units and increasing the flock-herd radius, respectively.
Such changes can generate a sharp increase in pair distance, meaning that gaps between UAVs increase, to the detriment of the communication radius, potentially breaking the topology and any chance at a consensus result. Assuming the communication radius is fixed, a particularly high tolerance is needed in order to respond to any counter-swarm strategy launched from the ground flock of AALVs.
Figure 11. As the number of units increases, the formation is more dense and the pair distance drops. In contrast, an increasing radius or a decrease in the number of agents (both resulting from targeted elimination from the ground) tends to increase the pair distance quite sharply, affecting the communication radius if the tolerance is not high.
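The dependence of pair distance on r and m follows from simple circle geometry: two units at radius r separated by an angle Δ are a chord of length 2r·sin(Δ/2) apart. The Python sketch below illustrates the trend under the assumption that the m units are spaced evenly over an arc of fixed width; the function names and the default arc width of π are hypothetical and not taken from the formulation of this paper.

```python
import math

def pair_distance(r, m, coverage=math.pi):
    """Chord between adjacent UAVs, ASSUMING m >= 2 units spaced evenly over
    an arc of width `coverage` at flock-herd radius r."""
    separation = coverage / (m - 1)
    return 2 * r * math.sin(separation / 2)

def link_survives(r, m, C, coverage=math.pi):
    """Adjacent units stay linked only while their distance is within C."""
    return pair_distance(r, m, coverage) <= C

# Hypothetical usage: losing units or retreating both widen the gaps.
for m in (20, 10, 5):
    print(m, pair_distance(1.0, m), link_survives(1.0, m, C=0.5))
```

Under this assumption the pair distance scales linearly with r and grows as m shrinks, which is consistent with the qualitative behavior described for Fig. 11.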

Consensus speed
The synchronous distributed algorithm for reaching consensus on the communication topology is limited by how far UAV units can reach each other (communication radius). Being a constant parameter, the communication radius is set once and remains the same for the entirety of the simulation. When the topology is compromised, it is the task of the aerial agents to improve consensus by changing the formation.
The consensus speed for different communication radii (C ∈ [0.1, 1]) in a swarm of 20 UAVs is now considered. The flock-herd radius is set to r = 1 and the Perlin noise parameter is δ = 50. The goal is to agree on the number of UAVs, which is m = 20 (ground truth). Figure 12 shows the convergence speed (in terms of number of iterations) for different communication radii.
Figure 13. (a) Angle of separation of UAVs versus formation size. (b) Maximum angular coverage for different formation sizes.
An iteration consists of all aerial units completing the ping, broadcast and receive routines and updating their knowledge of the topology graph. It is clear that consensus speed (number of iterations required) drastically improves with the communication range. The more units it covers, the faster a consensus can be reached.
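The effect of the communication radius on the number of iterations can be illustrated with a simplified sketch. It is not the paper's algorithm: it assumes the UAVs sit evenly on a full circle of radius r without Perlin noise, and it models one iteration as every unit synchronously merging the sets of unit IDs known to its in-range neighbors. The function names `ring_positions` and `consensus_iterations` are hypothetical.

```python
import math


def ring_positions(m: int, r: float) -> list:
    """Assumed layout: m UAVs evenly spaced on a circle of radius r."""
    return [(r * math.cos(2 * math.pi * j / m),
             r * math.sin(2 * math.pi * j / m)) for j in range(m)]


def consensus_iterations(positions, comm_radius, max_iters=1000):
    """Iterations until every UAV knows all unit IDs, or None if it stalls.

    One iteration stands in for the ping/broadcast/receive round: each
    unit merges what its in-range neighbors knew at the start of the round.
    """
    m = len(positions)
    neighbors = [
        [k for k in range(m)
         if k != j and math.dist(positions[j], positions[k]) <= comm_radius]
        for j in range(m)
    ]
    known = [{j} for j in range(m)]  # each unit initially knows only itself
    for iteration in range(1, max_iters + 1):
        # Synchronous update: build the new state from the old snapshot.
        updated = [set(known[j]).union(*(known[k] for k in neighbors[j]))
                   for j in range(m)]
        if updated == known:
            return None  # no new information: the topology is disconnected
        known = updated
        if all(len(ids) == m for ids in known):
            return iteration
    return None


if __name__ == "__main__":
    swarm = ring_positions(20, 1.0)
    for c in (0.2, 0.35, 0.65, 1.0):
        print(c, consensus_iterations(swarm, c))
```

On this idealized ring, a radius that reaches only the immediate neighbors needs 10 iterations to agree on m = 20, a radius that reaches two neighbors on each side needs 5, and a radius smaller than the pair distance never converges.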
As mentioned before, targeted elimination procedures disrupt the UAV formation, forcing them to close gaps so the communication range, and therefore consensus speed, remains unaffected. It will then be shown how the consensus algorithm performs when units are randomly removed.

Targeted elimination
In order to assess the performance of the targeted elimination procedure, it is important to measure the impact of the UAV herd size on the region of influence that affects the AALV flock. Figure 13(a) shows how the separation angle between UAVs changes with the formation size. The overall behavior shows an asymptotic increase of angular separation with the number of UAVs. This is considered to be an advantage for the UAVs, since a targeted elimination strategy will need to take down multiple units (e.g., potentially reducing the formation size to fewer than 10 units) in order to significantly increase the separation angle and therefore the gaps in the formation.
Furthermore, the parameter β is directly proportional to the total angular coverage of the UAV formation. Nonetheless, the asymptotic behavior is also shown in Fig. 13(b), where the coverage region is almost constant for any number of UAVs. The formation would have to take significant damage in order to reduce the coverage region so gaps can appear.

Adversarial experiments
To test the effectiveness of the proposed algorithms, the experimental setup considers both an empty arena and a 2-D maze. These experiments were conducted using the SCRIMMAGE simulator [35] and the arena spans in length from 0 to 20 units in both axes.
Initially, the arena does not consider any particular terrain or obstacles. Later, for the urban setting, real map data are used from downtown Adelaide, Australia (Fig. 14). The AALV group will traverse these environments, using the path-planning algorithm, while the UAVs follow in the air. The base setup comprises random positions for the goal g, the kill zone z and the AALV units (whose center of mass is s(0)).
For the urban setting, s(0) must be valid and occur on streets only. Simulations were repeated 100 times, setting the number of UAVs to m = 20 for the empty arena and m = 10 for the urban setting. The speed was set to v = 0.01, the main radius to r = 1 (distance between AALVs and UAVs) and both the goal zone and kill zone radii to r_g = r_z = 0.5 (the area that determines the mission outcome). Figure 15(a) shows the probability of a successful UAV mission (i.e., moving the flock toward the kill zone) versus the time it takes to complete it, for the empty arena scenario. Figure 15(b) shows the same plot for the urban setting.
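The mission outcome criterion implied by the goal and kill zone radii can be sketched as follows. This is a minimal illustration under stated assumptions, not the simulator's implementation: the function name `mission_outcome`, the outcome labels, and the choice to check the kill zone first when both zones overlap are all hypothetical.

```python
import math
from typing import Optional, Tuple

Point = Tuple[float, float]


def mission_outcome(s: Point, g: Point, z: Point,
                    r_g: float = 0.5, r_z: float = 0.5) -> Optional[str]:
    """Classify the mission from the flock's center of mass s.

    g and z are the goal and kill zone centers; r_g and r_z are their
    radii. Returns None while the flock is outside both zones.
    Assumption: the kill zone takes priority if the zones overlap.
    """
    if math.dist(s, z) <= r_z:
        return "uav_success"  # flock was steered into the kill zone
    if math.dist(s, g) <= r_g:
        return "uav_failure"  # flock reached its own goal zone
    return None               # mission still in progress


if __name__ == "__main__":
    goal, kill = (10.0, 10.0), (5.0, 5.0)
    print(mission_outcome((5.2, 5.1), goal, kill))
    print(mission_outcome((10.0, 10.4), goal, kill))
    print(mission_outcome((0.0, 0.0), goal, kill))
```

Counting the fraction of repeated runs that end in the first outcome gives a success probability of the kind plotted in Figure 15.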
A comparison was made between the adversarial execution of the scenarios without the AALVs eliminating targets, and one with the elimination procedure present. In addition, the targeted elimination was performed with two re-grouping strategies by the UAVs, the default one where units arrange themselves according to Eq. (3), and one where the formation changes according to Eq. (11) (i.e., fair positioning).
Results in Fig. 15(a) show that the targeted elimination strategy increases the probability of mission failure by approximately 5%, making no distinction on how the UAVs re-group themselves after one unit is eliminated. The urban setting shows a similar trend in terms of mission failure probability, although the increase appears more substantial for the default re-grouping strategy and less pronounced under fair positioning. This can be explained by the limited "vision" of the UAV formation when dealing with narrow streets in the urban setting. The default strategy seems to indicate an advantage for the AALVs, due to the generation of wider "blind spots" in the UAV formation.

Conclusion
This paper proposed an adversarial algorithm as a response to the problem of herding AALVs with a swarm of quadrotor UAVs. As opposed to common herding algorithms found in the literature, the agent being influenced is capable of focusing on its own objective (reaching a goal zone) as well as shooting down agents in the swarm, which opposes the AALV units by pushing them toward a kill zone. Experiments were conducted by considering an empty arena and an urban setting where the AALVs can perform a targeted elimination strategy. Results confirm the predominance of the UAV swarm over the AALVs when the adversarial objective is to block the AALVs' line of sight to the goal zone.
The success rate drops slightly when the targeted elimination strategy is in effect; however, it is also shown that the A* path-planning algorithm successfully re-routes the trajectory of the AALVs when the swarm is able to block their line of sight to the goal.
The overall effectiveness of the targeted elimination strategy depends on the UAV formation size and on the parameters that define the unit separation angle and total coverage region. For example, because the separation angle increases asymptotically with the UAV formation size, AALVs have to take down multiple units in order to produce significant gaps that can lead them to the goal zone. Similarly, the coverage region shows the same asymptotic behavior with formation size, being almost constant for more than 10 aerial units, again making it difficult for AALVs to significantly reduce UAV mission performance.
UAV re-grouping strategies were also analyzed, focusing on the way units are positioned in the formation. For the empty arena scenario, the choice of strategy made no difference in mission performance, while the urban setting shows that aerial units balancing around the kill zone vector is a better choice for maintaining small drops in performance after the targeted elimination attempts.
Since the current adversarial strategy only considers an angular positioning action space, future work will focus on making the whole swarm coverage area affect the decisions of the AALV group, specifically the way formation gaps and goal zone coverage influence the attempt of the AALVs to "escape" the UAVs.
In addition, to further test the robustness of the proposed adversarial algorithms, the group of AALVs could improve the targeted elimination procedure to actively reduce the number of aerial units, by fragmenting the swarm into separate chunks to further limit their communication capabilities.