Nomenclature
- DDS
-
data distribution service
- DT
-
Delaunay triangulation
- ENU
-
East-North-Up
- ES
-
electronic speed controller
- FCU
-
flight controller unit
- FLU
-
forward-left-up
- FPG
-
field programmable gate array
- HDL
-
hardware description language
- HLS
-
high-level synthesis
- IMU
-
inertial measurement unit
- IP
-
intellectual property
- KF
-
Kalman filter
- LiDAR
-
light detection and ranging
- MCC
-
maximum correntropy criterion
- PCD
-
point cloud dataset
- PDB
-
power distribution board
- RFID
-
radio frequency identification
- ROS
-
robotic operating system
- RTL
-
register-transfer level
- SDK
-
software development kit
- SoC
-
system-on-chip
- ToF
-
time-of-flight
- UAV
-
unmanned aerial vehicle
- UWB
-
ultra-wideband
Other nomenclature and symbols are defined as they appear.
1.0 Introduction
Unmanned aerial vehicles (UAVs) have significantly expanded research horizons, particularly in aerial navigation [Reference Mohsan, Othman, Li, Alsharif and Khan1, Reference Chi, Zhan, Wang and Zhai2, Reference Kose and Oktay3, Reference Arik4, Reference Oktay, Arik, Turkmen, Uzun and Celik5, Reference Oktay and Köse6, Reference Pritzl, Vrba and Saska7, Reference Telli, Kraa, Himeur, Ouamane, Boumehraz, Atalla and Mansoor8, Reference Uzun and Oktay9, Reference Sandino, Vanegas, Maire, Caccetta, Sanderson and Gonzalez10]. Despite extensive research efforts focused on deploying UAV systems with various navigation methods, especially for the challenging task of aerial indoor navigation, many proposed methodologies predominantly rely on simulations [Reference Elshakhs, Deliparaschos, Charalambous, Oliva and Zolotas11, Reference Dancila and Botez12]. These simulation-based approaches often fall short in meeting real-world deployment requirements.
This work presents a low-cost modular UAV hardware platform specifically designed to facilitate aerial indoor navigation research and system validation. Emphasising modularity, the platform provides a flexible and adaptable foundation for experiments, allowing researchers to customise and scale their setups within reasonable payload constraints to meet the specific demands of indoor environments. The work strongly contributes to the important area of open and explainable unmanned vehicle architectures for accelerating prototyping development and system deployment. One of the core modules is a field programmable gate array (FPGA) unit capable of efficiently computing Delaunay triangulations [Reference Rowell, Dunstan, Parkes, Gil-Fernández, Huertas and Salehi13]. Initially, triangulation calculations are completed using an incremental algorithm, with the C++ code subsequently converted into hardware description language (HDL) through high-level synthesis (HLS). After optimisation in simulation, the module is deployed onto the FPGA. The Delaunay triangulation core plays a crucial role in the UAV’s data processing, enabling efficient real-time surface reconstruction for spatial awareness. Delaunay triangulation being of particular interest in aerospace engineering and navigation application studies [Reference Kallis, Deliparaschos, Moustris, Georgiou and Charalambous14, Reference Hadjiloizou, Deliparaschos, Makridis and Charalambous15]. Experimental results are presented in this paper [Reference Hadjiloizou, Makridis, Charalambous and Deliparaschos16]. While this approach strongly accelerates prototyping and development, it also enables detailed investigation into UAV navigation precision and reliability in indoor spaces. Through rigorous investigation of its architecture, capabilities and potential applications, we demonstrate how the proposed platform advances UAV indoor navigation research development [Reference Deliparaschos, Loizou and Zolotas17, Reference Tech18].
The paper is structured as follows. Section 2 discusses the hardware description for the quadrotor setup. Section 3 explores the overall architecture of the UAV system, detailing the integration and interaction of various components essential for autonomous operation. Section 4 focuses on a key element in the UAV’s data processing, enabling efficient real-time surface reconstruction for spatial awareness: the Delaunay triangulation core. Section 5 presents the results, including key performance metrics, while Section 6 concludes the paper and discusses potential future enhancements.Footnote 1
2.0 Hardwear Description
The finalised quadrotor project integrates a variety of essential components that are crucial to its successful operation [19]. This section offers a detailed overview of the key hardware elements selected for the system, alongside an analysis of the decision-making process underpinning their selection. Each component was carefully assessed based on specific criteria, including performance, compatibility, power efficiency and its overall contribution to the system’s functionality. By examining the rationale behind each choice, this section underscores the considerations and trade-offs involved in constructing a reliable and efficient quadrotor UAV.
2.1 Inertial measurement unit
An inertial measurement unit (IMU) is a device that utilises a combination of accelerometer, gyroscope and magnetometer sensors to determine a body’s orientation, velocity and gravitational forces. Often, it is paired with other sensors, such as vision-based or wireless technologies, to enhance pose estimation accuracy. When other sensors are unavailable, the IMU can act as the primary sensor. The IMU provides measurements of a quadrotor’s orientation through the three Euler angles: roll, pitch and yaw. The Pixhawk PX4 flight control unit (FCU) includes three internal sensor chips, namely the Invensense MPU 6000 as the primary 3-axis accelerometer/gyroscope, the ST Micro L3GD20 3-axis 16-bit gyroscope and the ST Micro LSM303D 3-axis 14-bit accelerometer/magnetometer.
2.2 Flight controller unit
The FCU is responsible for controlling the UAV’s movement and regulating the power delivered to each motor. Flight controllers assess the UAV’s level and speed, using this data to correct its orientation during flight. For this project, the Readytosky Pixhawk PX4 2.4.8 32-bit flight controller was chosen due to its advanced capabilities beyond the standard gyroscope and accelerometer sensors found in most flight controllers. The Pixhawk incorporates a barometer, magnetometer and IMU, enhancing the precision and reliability of flight performance. Furthermore, the Pixhawk is a 32-bit controller equipped with I2C and SPI interfaces and supports both the PX4 and ArduPilot flight stacks.
2.3 Onboard computer options
The quadrotor’s onboard computer, running the robot operating system 2 (ROS 2), manages critical tasks such as processing stereo camera frames for image recognition, handling ground ultra-wideband (UWB) sensor data, analysing point clouds from the light detection and ranging (LiDAR) sensor, integrating measurements from the flight controller, transmitting commands to the flight controller and streaming video to the ground station. To meet the need for a powerful, lightweight and energy-efficient computer, the FZ3 card was selected.
The FZ3 card [Reference Zheng, Zhang, Tan and Li20], featuring an AMD Zynq UltraScale+ ZU3EG MPSoC, is employed for real-time point cloud processing. Leveraging a dedicated Delaunay triangulation core [Reference Hadjiloizou, Makridis, Charalambous and Deliparaschos16], developed by the authors in prior work, within the programmable logic (PL) FPGA portion of the Zynq UltraScale+ ZU3EG MPSoC further enhances its efficiency in processing point cloud data.
2.4 Sensors: camera, LiDAR, ultrasonic sensors
Camera-based landmark detection is a well-established method for robot localisation. This technique employs a camera mounted on the quadrotor and a set of landmarks whose positions are known relative to the global reference frame. By analysing objects detected by the camera, the quadrotor’s position can be estimated. Using a monocular camera, depth information about the landmarks can be determined by processing successive frames. This enables estimation of the relative position between a landmark and the camera. Once the camera’s relative position is known, the quadrotor’s relative position, to which the camera is attached, can be calculated and expressed within the global reference frame.
Initially, we explored using a single basic RGB camera to capture video data for an object detection algorithm that would identify object locations. However, this setup posed challenges in accurately determining object distances. To address this limitation, we selected the ZED 2 stereo camera [Reference Kneip, Tache, Caprari and Siegwart21], which computes distances by analysing pixel differences between two images. The stereo camera connects via USB 3.0 to the FZ3 card. A drawback of this solution is the significant processing power required, particularly when simultaneously running a computationally intensive object detection algorithm.
Alternatively, a LiDAR sensor was considered as a less computationally intensive approach compared to using two RGB cameras. The point cloud generated by the LiDAR sensor allows object recognition while also determining object distances through the reconstructed surface created by the Delaunay triangulation core [Reference Hadjiloizou, Makridis, Charalambous and Deliparaschos16], implemented on the FZ3 accelerator card.
The LiDAR sensor is invaluable for detecting obstacles within the environment [Reference Krejsa and Vechet22]. For our system, we integrated a Hokuyo URG-04LX-UG01 2D LiDAR [23, Reference Kendoul, Lara, Fantoni and Lozano24] onto the UAV. This sensor offers a range of 20mm to 5500mm (with a resolution of 1mm) and a 240
${^ \circ }$
scan window with an angular resolution of 0.36
${^ \circ }$
. Its low power consumption (2.5W) and lightweight design (138g) make it an ideal choice for UAV applications.
For additional range sensing, we selected the Atmel ATMEGA328P microcontroller, mounted on an Arduino Pro Mini board, to manage four laterally mounted HC-SR04 ultrasonic range sensors. This 8-bit, 16 MHz microcontroller is well-suited for driving ultrasonic sensors and transmitting distance data to the processing module via UART or I2C communication. Its compact form factor, lightweight build, low power consumption and user-friendly development environment make it an excellent choice for this application. Moreover, the extensive documentation available for the Arduino platform enhances its integration and reliability in distance measurement tasks.
2.5 Communication modules: Wi-Fi and ultra-wideband (UWB)
Wi-Fi connectivity is essential for enabling communication between the onboard UAV computer and the ground control computer, particularly for the transmission of video data. As the FZ3 card does not include an integrated Wi-Fi module, the AC1300 Archer T3U Mini Wireless MU-MIMO USB adapter was selected. The choice of a USB Wi-Fi adapter, rather than a dedicated Wi-Fi card, was made to simplify system integration, as the latter would require an external antenna for effective operation. The compact USB module incorporates a built-in antenna, thereby reducing hardware complexity and easing installation.
The selected adapter also supports multi-user multiple-input multiple-output (MU-MIMO) technology, which enables two simultaneous data streams. When used in conjunction with a compatible MU-MIMO router, this feature improves overall throughput and network efficiency.
Ultra-wideband (UWB) technology was additionally incorporated to provide accurate localisation capabilities. UWB is a wireless communication technology that employs short, low-energy pulses transmitted over a wide bandwidth, making it inherently robust to multipath interference. In contrast to conventional radio-frequency identification (RFID) systems that operate within narrow frequency bands, UWB spans a broad spectrum, resulting in low power spectral density and reduced interference with other radio-frequency signals.
A key advantage of UWB lies in its precise time-of-flight (ToF) measurement capability, which enables accurate distance estimation and makes the technology particularly well suited for indoor localisation. A typical UWB localisation setup consists of wireless transmitters, referred to as anchors, positioned at fixed locations, and a receiver, known as a tag, mounted on the quadrotor. The tag measures the arrival times of signals transmitted by the anchors to estimate the quadrotor’s position in three-dimensional space. Conversely, if the anchors are arranged in a known geometric configuration and the initial position of the tag is specified, the anchors’ positions may also be inferred.
The UWB system employed in this work is based on the Pozyx platform [Reference Voos25], which utilises the Decawave DWM1000 transceiver.
In addition to Wi-Fi and UWB communication, radio control functionality is provided by a Radiolink R8EF receiver. This 8-channel receiver operates at a frequency of 2.4 GHz and is connected to the quadcopter’s FCU. Specifically, the SBUS output (channel 1) is interfaced with the Pixhawk RC input. The receiver relays control commands generated by the radio transmitter based on pilot stick inputs, thereby enabling precise manual control of the quadcopter when required.
2.6 Power components: battery, DC-to-DC converter
The quadrotor is powered by a lithium polymer (LiPo) battery, connected to a power distribution board (PDB) that distributes power to the various components. The selected battery is a 3S1P LiPo battery (three cells connected in series) with a nominal voltage of 11.1 volts, a capacity of 4400 mAh, a 30C discharge rate, and a mass of 453g. The choice of this battery was based on its balance of output voltage and mass.
To power the FZ3 card from the UAV’s power distribution board, we employed a buck converter. A buck converter, also known as a step-down converter, is a DC-to-DC converter that reduces input voltage while increasing output current. This type of converter falls under the category of switched-mode power supplies.
The FZ3 card requires a 12V DC power supply with a maximum current of two amps. Ensuring adequate power is critical, particularly during computationally intensive tasks such as running object detection algorithms, to prevent a CPU shutdown.
2.7 Propulsion and control system
Four A2212 brushless DC motors (BLDC), rated at 1000 kV (1000 rpm/V)Footnote
2
, were selected for the propulsion system, each exhibiting a maximum efficiency of approximately 80%. The motors are driven by 30 A electronic speed controllers (ESCs) and are equipped with 10
$ \times $
4.5 inch propellers in both clockwise (CW) and counter-clockwise (CCW) configurations. Each motor is capable of producing a maximum thrust of approximately 7.85 N, corresponding to 800 gram-force (gf).
The propulsion system components operate in unison to ensure stable and efficient flight. Despite their name, brushless DC motors are not true DC machines but rather polyphase synchronous motors supplied by electronically commutated currents. In these motors, DC power is converted into alternating currents through electronic switching, generating a rotating magnetic field within the stator. In contrast to brushed DC motors, which rely on mechanical commutation via brushes, BLDC motors achieve commutation electronically, resulting in improved efficiency, reduced wear, and enhanced reliability.
3.0 System Architecture
This section details the UAV system architecture, emphasising the central processing units and sensor modules that enable autonomous operation (navigation, obstacle detection and control). By outlining the architectural framework, this section lays the groundwork for understanding how individual subsystems work together to ensure reliable quadrotor performance.
3.1 UAV overall architecture
The UAV system architecture is designed for efficient, autonomous operation and features several key components. Central to this architecture is the FZ3 card, equipped with a Zynq UltraScale+ ZU3EG MPSoC, which manages high-level computations, decision-making and surface reconstruction implementation on the PL side. The flight controller, specifically the Readytosky Pixhawk, connects to the FZ3 card and is responsible for managing the UAV’s movement and motor control. The system architecture is shown in Fig. 1.

Figure 1. Quadrotor system architecture.
Sensory input is provided by various sensors, including a stereo camera for image recognition, an IMU for orientation data, UWB sensors for localisation and a LiDAR sensor for point cloud data. These sensors collectively provide essential information for navigation and obstacle avoidance. The system is powered by a LiPo battery connected to a PDB, which supplies power to all components. For communication, the AC1300 Archer T3U Mini Wireless MU-MIMO USB adapter enables data exchange with a ground control station. The overall architecture is designed for modularity, offering flexibility to adjust configurations for different scenarios and experiments.
3.2 Coordinate systems for UAV positioning
To accurately determine the position of the quadrotor in space and the relative positions of surrounding objects, two coordinate frames are defined using the standard right-handed robotics convention, as illustrated in Fig. 2. The Earth’s inertial frame
$\left\{ E \right\}$
adopts the East-North-Up (ENU) reference system, where the
$ + x$
axis points east,
$ + y$
points north and
$ + z$
points upwards in accordance with the right-hand rule. The body frame of the quadrotor
$\left\{ B \right\}$
is coincident with the origin and thus represents the quadrotor’s absolute position,
$\left[ {x,y,z} \right]$
. It adheres to the Forward-Left-Up (FLU) convention, which defines forward horizontal movement along the
$ + x$
axis, leftward horizontal movement along the
$ + y$
axis and upward vertical movement along the
$ + z$
axis.

Figure 2. F450 quadrotor model with coordinate frame.
3.3 Quadrotor dynamics
The quadrotor is an inherently unstable, non-linear and complex system comprising four rotors, as indicated by its name. Each rotor includes a propeller and a motor that generates an angular velocity
${\omega _i}$
, producing a thrust force
${f_i}$
, where
$i$
denotes the motor number, as illustrated in Fig. 2. Two rotors rotate clockwise, while the other two rotate counterclockwise, counteracting any undesired rotation of the quadrotor body in the yaw (
$\psi $
) direction due to the conservation of angular momentum.
The angular velocities of the rotors govern specific rotational coordinates,
$\eta = \left( {\phi ,\theta ,\psi } \right) \in {\mathbb{R}^3}$
, which in turn control the quadrotor’s motion in the translational coordinates,
$\xi = \left( {x,y,z} \right) \in {\mathbb{R}^3}$
, relative to the Earth’s inertial frame
$\left\{ E \right\}$
. The quadrotor’s orientation is defined by the Euler angles
$\phi $
,
$\theta $
and
$\psi $
, where
$\phi $
(roll) is the angle about the
$x$
-axis,
$\theta $
(pitch) is the angle about the
$y$
-axis, and
$\psi $
(yaw) is the angle about the
$z$
-axis. The translational coordinates
$x$
,
$y$
and
$z$
represent the centre of mass of the quadrotor with respect to the Earth’s inertial frame.
The quadrotor’s translational equations of motion in the Earth’s frame are derived using the Newton–Euler formalism [Reference Bouabdallah26, Reference Jia, Yu, Mei, Chen, Shen and Ai27], while the rotational equations are derived using the same formalism [Reference de Berg, Cheong, van Kreveld and Overmars28]. These equations assume relatively small quadrotor movement angles:
Here,
$m$
represents the quadrotor’s mass,
${f_T}$
is the total thrust force and
$g$
is the gravitational acceleration. The torques
${\tau _x}$
,
${\tau _y}$
and
${\tau _z}$
correspond to the roll, pitch and yaw torques, respectively.
$J$
denotes the rotor inertia, while
${{{\Omega }}_i}$
represents the angular velocity of the
${i^{th}}$
rotor [29, 30]. Additionally,
${I_x}$
,
${I_y}$
and
${I_z}$
are the moments of inertia of the quadrotor’s symmetric rigid body about its three principal axes.
A simplified linearised model is obtained around the hovering equilibrium point via small-angle approximation (i.e. since we assume relatively low navigation speed, the rotational angles of the system are small also). This approximation leads to
$\dot \phi ,\dot \theta ,\dot \psi \simeq 0$
, and
$s\phi \simeq \phi ,s\theta \simeq \theta ,s\psi \simeq \psi $
, with
$c\phi = 1$
,
$c\theta = 1$
and
$c\psi = 1$
. The system state vector is defined as
${\bf{x}} = {[xyz\phi \theta \psi \dot x\dot y\dot z\dot \phi \dot \theta \dot \psi ]^T}$
. The hovering equilibrium point,
${\bar {\textbf{x}}} = {[\bar x\bar y\bar z000000000]^T}$
is reached when the total thrust corresponds to a constant control input of
${f_T} = mg$
, providing the force necessary to sustain the quadrotor’s hover at an arbitrary position
$\left( {\bar x,\bar y,\bar z} \right)$
. The linearised system derived from this analysis is expressed as:
4.0 Delaunay Triangulation Core Implementation For Surface Reconstruction
Surface reconstruction is a fundamental computational technique with extensive applications across numerous fields. It involves sampling an object’s surface and applying algorithms to reconstruct its structure (e.g. Fig. 3). This work concentrates on surface reconstruction for surfaces representable in the form
$z = f\left( {x,y} \right)$
, commonly referred to as a terrain [31]. Such surfaces are frequently utilised in digital elevation models but also appear in various other disciplines, as they can represent local charts of manifolds – sufficiently small patches that can be appropriately ‘glued’ together to form a larger surface embedded in
${\mathbb{R}^3}$
.

Figure 3. The Stanford Bunny model (left) has been reconstructed (right, comprising 397 points). For improved viewing clarity, the DT figure on the right has been post-processed and does not depict the convex hull of the surface points.
4.1 Delaunay triangulation algorithm
This work utilises the Delaunay triangulation algorithm for surface reconstruction. In any triangulation, a triangle’s circumcircle must not contain any other points from the set within its boundary; the interior of the circumcircle should include only the triangle’s vertices. The incremental algorithm offers a step-by-step approach to constructing Delaunay triangulations. Starting from an initial triangulation, it adds new points sequentially, updating the triangulation with each insertion to maintain the Delaunay property.
A critical step in this method involves subdividing affected triangles when a new point is introduced, followed by edge flipping to ensure compliance with the Delaunay condition. Each point insertion results in localised adjustments, preserving the Delaunay property throughout the triangulation process.
To enable efficient real-time surface reconstruction, the algorithm is implemented on an FPGA via the DT-core (Delaunay triangulation core).
The process commences by projecting all sampled points onto the x-y plane while retaining their original height values. A Delaunay triangulation is subsequently applied to the projected points, segmenting the surface into multiple triangles. The points within the triangulation are then elevated to their recorded heights, creating a piecewise linear representation of the surface using these triangles. An increase in the number of points results in the formation of more triangles, thereby improving the quality of the surface reconstruction.
4.1.1 Incremental algorithm
To reconstruct the surface, the incremental algorithm is employed. This algorithm begins by creating an initial super-triangle that encompasses all points in the dataset. For each new point inserted, the triangle containing the point is identified and subdivided into three new triangles based on the newly added point. Subsequently, the circumcircles of these new triangles are examined to ensure they do not enclose any other points from the dataset. If the Delaunay condition is violated, edge-flipping is performed to restore compliance. This process continues for each new point until all triangles satisfy the Delaunay condition. The algorithm has a time complexity of
$O\left( {n{\textrm{log}}n} \right)$
, where
$n$
is the number of points in the dataset.
The Delaunay triangulation is detailed in Algorithm 4.1.1. In line 1, a triangle is constructed to encompass all the points on the surface being reconstructed. Lines 2–3 involve processing each point from the set
$P$
, iterating over all triangles to find the one containing the current point. If such a triangle is found, it is subdivided into three new triangles in line 5, with their vertices formed by the newly inserted point. The set of triangles is then updated accordingly in line 6. The new triangles are marked as unprocessed for further investigation in line 7.
Lines 8–14 of the algorithm are dedicated to verifying the Delaunay condition for all unprocessed faces. If a face fails to satisfy the condition (line 9), the flipping procedure is applied in line 10 to transform it into a valid Delaunay triangle. The updated faces are marked for further investigation in line 11, while the edges and triangles are updated in lines 12–13. This iterative process continues until all triangles comply with the Delaunay condition. Finally, line 15 outlines the return of
$D\left( P \right)$
, the edges of the completed Delaunay triangulation [Reference Hadjiloizou, Makridis, Charalambous and Deliparaschos16].
Algorithm 1: Incremental Delaunay Triangulation Algorithm

4.2 On the use of two-dimensional delaunay triangulation for mapping and navigation
The Delaunay triangulation (DT) approach adopted in this work operates in a two-dimensional domain, where triangulation is performed using only the planar
$\left( {x,y} \right)$
coordinates. Each triangulation vertex retains its own elevation value, which may vary across the surface and is used to reconstruct surfaces of the form
$z = f\left( {x,y} \right)$
.
This design choice is intentional and well aligned with typical indoor UAV navigation requirements. In many practical scenarios, motion planning and obstacle avoidance are carried out predominantly in the horizontal plane, while altitude is regulated independently by the flight control system. Under these conditions, a two-dimensional triangulated representation provides sufficient geometric information for mapping, free-space estimation and collision avoidance, while significantly reducing computational complexity.
Compared to full three-dimensional Delaunay triangulation, the 2D formulation offers advantages in terms of determinism, memory footprint and suitability for real-time implementation on resource-constrained SoC FPGA platforms. Although full 3D triangulation may be required for volumetric mapping or multi-level environments, such functionality lies beyond the scope of this work. The proposed DT core nevertheless provides a scalable foundation that can be extended towards higher-dimensional triangulation approaches in future research.
4.3 Delaunay triangulation core high-level architecture
The DT algorithm was implemented on an FZ3 accelerator card, which is built around the AMD Zynq UltraScale+ ZU3EG MPSoC. This platform integrates a SoC that combines a processing system (PS) – featuring a 1.2 GHz Quad-Core ARM Cortex-A53 and a 600 MHz Dual-Core Cortex-R5 – with PL, where the DT core was deployed. This architecture enables real-time surface reconstruction.
4.3.1 Hardware platform: FZ3 accelerator card
This work utilised the FZ3 accelerator card as the onboard computer, whose technical specifications are summarised in Table 1.
The onboard computer leverages the hardware capabilities of the AMD Zynq UltraScale+ MPSoC FZ3 accelerator card [Reference Nyboe, Malle and Ebeid32]. This platform is a general-purpose board that integrates a heterogeneous computing system consisting of a PS and PL, which communicate via a 128-bit AMBA AXI4 interconnect.
Processing system (PS)
The PS integrates several heterogeneous processing units:
-
• Cortex-A53 application processing unit (APU): a 64-bit quad-core Arm v8 CPU with an 8-stage, 2-way superscalar in-order pipeline. Each core includes separate 32 KB L1 instruction and data caches, a shared 1 MB L2 cache and a memory management unit (MMU).
-
• Cortex-R5 real-time processing unit (RPU): a 32-bit dual-core processor (Arm v7 architecture) with 32 KB of L1 instruction and data caches per core. Each core also has a local 128 KB tightly coupled memory (TCM). The cores can operate independently or in lockstep mode and support error correction code (ECC), enabling detection of up to two errors and correction of one.
-
• Mali-400 MP2 graphics processing unit (GPU): includes pixel and geometry processors with a 64 KB L2 cache.
Table 1. Features of the FZ3 accelerator card

Programmable logic (PL)
The PL section provides significant reconfigurable computing resources for hardware acceleration and sensor interfacing, as summarised below:
-
• Programmable logic cells: 154 K
-
• Look-up tables (LUTs): 71 K
-
• Flip-flops: 141 K
-
• Distributed RAM: 1.8 Mb
-
• Block RAM: 7.6 Mb
-
• DSP slices: 360
A block-level overview of the SoC architecture and its subsystems is illustrated in Fig. 4.

Figure 4. Block-level overview of the Zynq UltraScale+ SoC architecture and its subsystems. Adapted from Ref. (Reference Nyboe, Malle and Ebeid32).
4.3.2 Implementation in programmable logic
The DT core is implemented on the PL side to enable point cloud surface reconstruction using point cloud datasets (PCD) received from the stereo camera or LiDAR on the UAV. This approach leverages the programmability and performance of FPGA hardware to efficiently process the incoming PCD data for real-time reconstruction.
To develop the DT core, a high-level synthesis (HLS) process is employed. This process begins with a high-level description of the algorithm, written in languages such as C/C++, Python, or MATLAB. The code is then analysed, with architectural constraints and scheduling strategies applied to generate an optimised register-transfer level (RTL) HDL. Subsequently, the RTL is synthesised into gate-level circuits using logic synthesis tools, transforming the high-level algorithm into efficient hardware logic.
For high-level synthesis, Vitis HLS was used as the synthesis tool, packaging the resulting RTL into an intellectual property (IP) core on an AXI4 bus. This process is illustrated in Fig. 5.

Figure 5. DT core implementation on PL side.
Compared to the original software implementation, this design significantly improves surface reconstruction speed, with the IP core demonstrating substantial performance gains over the software-based approach, as shown in Fig. 12 [Reference Hadjiloizou, Makridis, Charalambous and Deliparaschos16].
For efficient point handling, an incremental insertion process was developed, controlled by the ‘tri2D_loop1’ function, which calls the ‘insert_site’ function in each iteration. The ‘insert_site’ function runs in parallel with the loop and manages the insertion of each surface point. The number of iterations in the loop depends on the total number of surface points, directly influencing the loop’s latency cycles. Within the ‘insert_site’ function, the ‘insert_site_loop1’ loop is responsible for locating the legal triangle (stored in
$T$
) that contains the current point. To verify whether the identified triangle is valid, the loop invokes the ‘in_triangle’ function. If the required triangle is difficult to locate, the number of iterations in the loop may increase; however, the maximum number of iterations is governed by the equation
$2n - 2 - k$
, which represents the maximum number of triangles that can be constructed during the Delaunay triangulation based on the number of points
$n$
.
After the legal triangle is found, the ‘insert_site’ function calls the ‘legalise_edge’ function to verify the legality of each new edge. During this step, ‘initialisation_loop2’ initialises the table containing triangles to be checked. The ‘legalise_edge_loop1’ loop is responsible for checking all affected triangles until the verification process is complete. Ultimately, the triangulation results are output as ‘edge_matrix’. Figure 6 illustrates the overall DT-core architecture.

Figure 6. DT core architecture.
4.3.3 Real-time data exchange with ROS 2 in the processing system
In the PS, the surface reconstruction and implementation framework utilises the ROS 2 communication architecture to enable real-time data exchange between system components. This section outlines the ROS 2-based communication system and its role in managing data flow across various modules, including sensors and processing units. This configuration is critical for efficient point cloud processing, which serves as the foundation for the UAV’s environmental perception and navigation capabilities.
The ROS is an open-source middleware extensively used in robotics. However, ROS 1 is not well-suited for real-time embedded systems due to its inability to meet real-time requirements and its limited operating system compatibility. To address these limitations, ROS 1 has been significantly updated to ROS 2, which leverages the data distribution service (DDS). DDS is better suited for real-time distributed embedded systems due to its various transport configurations, such as deadline management, fault tolerance and scalability.
ROS 2 allows a primary programme to be divided into multiple subprocesses that operate concurrently and interact with each other. In ROS 2, nodes represent processes, each assigned a distinct task to enhance overall system efficiency. Communication among these nodes occurs through a series of topics and services. Topics facilitate the transfer of information, known as messages, from one node to another. Nodes publish and/or subscribe to topics to send and/or receive these messages. Services operate on the request/response model: one node offers a service that another node invokes using the service name. The node creating the service is the server, while the nodes calling the service are the clients. Topics are suitable for managing continuous data streams, while services are ideal for blocking calls, making them suitable for remote procedure calls that terminate quickly.
The uXRCE-DDS middleware is employed within PX4 to enable the publication and subscription of uORB messages on the FZ3 card, mimicking ROS 2 topics. This facilitates swift and dependable integration between PX4 and ROS 2, simplifying the process for ROS 2 applications to access UAV information and issue commands. PX4 utilises an XRCE-DDS implementation that takes advantage of eProsima Micro XRCE-DDS [33]. The uXRCE-DDS middleware comprises a client operating on PX4 and an agent functioning on the FZ3 card. These components facilitate bidirectional data exchange through a serial or UDP connection. Acting as a proxy for the client, the agent allows it to both publish and subscribe to topics within the global DDS data space. For PX4 uORB topics to be accessible on the DDS network, the uXRCE-DDS client on PX4 must connect with the micro XRCE-DDS agent on the FZ3 card. The PX4 uxrce_dds_client manages the publication and subscription of designated uORB topics within the global DDS data space. On the FZ3 card, the eProsima micro XRCE-DDS agent serves as a proxy for the client within the DDS/ROS 2 network.
A critical aspect of the system implementation is the interface between the PX4 flight controller and the onboard computer realised on the FZ3 card, enabling the UAV to operate safely indoors. Serial communication is employed for this interface via the TX and RX lines between the onboard computer and the PX4, connected through the TELEM2 port. The PX4 runs firmware version 1.15 to ensure proper interfacing, incorporating the uXRCE-DDS middleware on the PX4 side. The same uXRCE-DDS middleware is executed on the FZ3 card, integrated with the ROS 2 Humble release. To establish this interface, a PetaLinux project was developed, incorporating the hardware implementation, the ROS 2 environmentand a bootable image that includes an Ubuntu root file system.
The communication architecture is based on ROS 2 between the UAV modules (onboard computer, FCU and microcontroller (
$\mu $
C)) and the ground station (computer and RC controller). The IMU, UWB, LiDAR, stereo camera and ESC nodes send (i.e. publish) and receive (i.e. subscribe) data via several topics, while the buzzer and safety switch nodes send and receive data via services. Figure 7 presents the implementation on the PS and the interface with the PL through the AMBA Interconnect/AXI4 protocol [Reference Gatti34].

Figure 7. ROS 2-based communication architecture.
4.4 PetaLinux and Vivado workflow process design
To build the system framework, two distinct processes are followed: the PetaLinux workflow and the Vivado workflow, as illustrated in Fig. 8 [Reference Saravanakumar, Sultan, Shahar, Giernacki, Lukaszewicz, Nowakowski, Holovatyy and Stepien35, Reference Townsend, Jiya, Martinson, Bessarabov and Gouws36]. PetaLinux operates in the PS of the Zynq SoC, enabling the creation of customised operating systems for SoC designs. Vivado, a design suite from AMD for adaptive FPGAs and SoCs, is employed for design entry, synthesis, place-and-route and verification/simulation of HDL designs implemented in the PL. Vivado provides comprehensive tools that facilitate the entire design process, including design entry, synthesis, place-and-route and verification/simulation [Reference Saravanakumar, Sultan, Shahar, Giernacki, Lukaszewicz, Nowakowski, Holovatyy and Stepien35, Reference Bauersfeld and Scaramuzza37].

Figure 8. Framework workflow process.
4.5 Indoor navigation and environment reconstruction
The indoor navigation framework of the proposed UAV platform relies on a sequential sensing-to-mapping pipeline that enables real-time perception and decision-making in GPS-denied environments. The process begins with data acquisition from multiple onboard sensors, including the LiDAR, stereo camera, UWB, IMU and ultrasonic range sensors. The LiDAR and stereo camera provide dense spatial information that is fused into a three-dimensional point-cloud representation of the surrounding environment, while the UWB and IMU modules supply localisation and attitude data to maintain a consistent reference frame.
The generated point cloud is transmitted to the FZ3 accelerator card, where the FPGA-based DT core performs incremental triangulation of the incoming data. Each new set of sampled points is processed in real-time, producing a topologically consistent triangular mesh. This mesh converts the unstructured point cloud into a continuous surface representation, effectively reconstructing the surrounding indoor environment. By updating the triangulation incrementally as new sensor data arrive, the system maintains a dynamic geometric model of the UAV’s immediate surroundings.
The reconstructed surface serves as the foundation for environment modelling. The triangulated mesh delineates free-space and obstacle regions based on geometric features such as height gradients and surface normal vectors. These features are subsequently utilised by the navigation and control modules to perform path planning and obstacle avoidance. The onboard ROS 2 nodes interpret the reconstructed model to determine safe flight corridors, compute feasible trajectories and issue corresponding motion commands to the flight controller.
This continuous loop, spanning from sensor acquisition through Delaunay-based surface reconstruction to navigation control, enables the UAV to achieve accurate, robust and real-time indoor navigation without external positioning systems.
5.0 Results and Analysis
This section presents the findings of the quadrotor project, focusing on key performance metrics. The Delaunay triangulation core simulation and performance subsection demonstrates the simulation and performance results of the implemented core for various PCD sizes and how these compare to the software counterpart implementation. The mass analysis subsection examines the overall mass distribution and its impact on the quadrotor’s flight dynamics. The power estimation subsection evaluates the system’s power consumption, providing insights into the UAV’s efficiency and endurance.
5.1 Delaunay triangulation core simulation and performance
The following FPGA implementation and performance evaluation of the DT core build upon the authors’ previous work [Reference Hadjiloizou, Makridis, Charalambous and Deliparaschos16].
The primary objective of the hardware implementation is performance optimisation, with particular emphasis on reducing latency cycles. Vitis HLS was used to translate the DT algorithm described in Algorithm 4.1.1 into RTL logic, enabling a detailed analysis of the design’s timing behaviour. Following synthesis, the latency associated with each function and loop was identified, and an in-depth analysis of the cycle distribution was conducted. Figure 9 illustrates the main functions and loops contributing to the overall execution latency.

Figure 9. FPGA design module performance: latencies in terms of clock cycles and time (ns).
Among the key functions, tri2d serves as the top-level controller, managing all inputs (such as x_coor and y_coor) and producing the final triangulation output (edge_matrix). As the top-level function, tri2d encapsulates the complete execution latency of the design, and its internal processes therefore have a dominant impact on overall performance.
The triangulation process begins with an initialisation phase implemented in initialisation_loop1, which prepares table
$T$
for storing the set of legal triangles. This is followed by the execution of the triangle_first function, which inserts an initial enclosing triangle that contains all surface points. This operation is performed only once and establishes the initial triangulation domain, after which auxiliary edges associated with the enclosing triangle are removed.
Following initialisation, the tri2D_loop1 iteratively inserts surface points into the triangulation. During each iteration, the insert_site function processes a single point and updates the triangulation accordingly. Within this function, the insert_site_loop1 searches table
$T$
to identify the triangle that contains the current point. This search relies on repeated calls to the in_triangle function and constitutes a significant portion of the overall latency, as its iteration count depends on the number of triangles present in the triangulation.
Once the containing triangle has been identified and subdivided, the legalize_edge function is invoked to restore the Delaunay property. An initialisation step (initialization_loop2) prepares the set of affected triangles, which are then processed by legalize_edge_loop1. This loop iteratively examines and updates neighbouring triangles through edge-flipping operations until all affected edges satisfy the Delaunay criterion. Upon completion of this process for all inserted points, the final triangulation is produced in the form of the edge_matrix output.
More information about the iterations of each loop or the total number of latency cycles for every function and loop of the design are presented in detail in Table 2. Furthermore, the table presents the optimisation with the directives we used in order to achieve more reduction in the clock cycles of the design.
Table 2. Latency analysis of the DT core architecture

Figure 10 shows a waveform snapshot captured from Siemens’ (formerly Mentor Graphics) ModelSim digital simulation tool during the HDL project co-simulation process. The HDL simulator processes simulation requests (e.g. an input PCD file) generated by the C++ testbench. The PCD file, which represents point cloud data captured from the UAV’s stereo camera or LiDAR sensor, provides the input for simulation. The core’s inputs and outputs, which accept an array of PCD points (
$x\_xcoor$
,
$y\_coor$
) and produce the computed Delaunay triangulation edges matrix (
$edge\_matrix$
), are implemented using the Vitis_HLS tool with the
$ap\_memory$
port interface. This interface facilitates communication with RAM blocks, particularly when implementations, such as the 2D Delaunay core, require random access to memory addresses. The control signal
$ap\_start$
ensures computation does not begin until it is asserted (high), while the signals
$ap\_done$
and
$ap\_idle$
indicate the completion of computation and an idle state, respectively. Due to the pipelined nature of the design, the output signal
$ap\_ready$
signifies when the core is prepared to accept new inputs.

Figure 10. HDL model co-simulation.
The hardware implementation was evaluated using a standard benchmark, the Stanford Bunny model, with varying point densities (477, 953, 1906 and 3811 points), as depicted in Fig. 11. The results of the comparison between the software and FPGA core are shown in Fig. 12. The evaluation demonstrated a substantial speed-up over the software implementation, achieving a 20
$ \times $
improvement in execution time. The FPGA implementation successfully managed real-time surface reconstruction, attaining a frame rate of 70 FPS for a point cloud consisting of 477 points.

Figure 11. Stanford Bunny, sampled at different numbers of points.

Figure 12. Comparison between software and FPGA core.
Compared to the original software implementation, this design significantly improves surface reconstruction speed, with the IP core demonstrating substantial performance gains over the software-based approach.
Table 3 illustrates the comparison between the software and hardware implementations, showing both the execution time and the speed-up achieved through hardware acceleration. These results confirm the suitability of the design for real-time applications.
5.2 Mass analysis
The total takeoff mass (
${m_{TO}}$
) of the electric multirotor UAV is represented as [Reference Biczyski, Sehab, Whidborne, Krebs and Luk38]:
Where
${m_0}$
is the mass of the frame, avionics and propulsion system (excluding the battery),
${m_b}$
is the battery mass and
${m_{pl}}$
is the payload mass, including the onboard computer, camera, LiDAR, ultrasonic sensors and UWB tag.
Equation (3) was applied during the design stage to balance component selection, payload capacity and battery sizing, ensuring the total takeoff mass remained within the propulsion system’s thrust limits.
Table 4 summarises the mass distribution of the final configuration. The resulting takeoff mass of the quadcopter is approximately 1.5 kg when equipped with the FZ3 card.
Table 3. Performance results (time in ms)

Table 4. Mass breakdown of the various UAV platform components

5.3 Power estimation
The power consumption of a BLDC motor depends on its electrical input and mechanical output, which are governed by the applied voltage, current, load torque and rotational speed. The electrical input power
${P_e}$
(W) is defined as
while the corresponding mechanical output power
${P_m}$
(W) is given by
where
$T$
is the torque (N m),
$\omega $
is the angular velocity (rad s
${\;^{ - 1}}$
),
$N$
is the motor speed in revolutions per minute (RPM), and the constant
$9.55 = 60/\left( {2\pi } \right)$
converts RPM to rad s
${\;^{ - 1}}$
. The motor efficiency
${\eta _m}$
is therefore expressed as
LiPo batteries are widely used in small UAVs owing to their high specific energy and high discharge capability [Reference Hwang, Cha and Jung39, Reference Liscouët, Pollet, Jézégou, Budinger, Delbecq and Moschetta40]. The available battery energy
${E_b}$
(Wh), accounting for conversion losses and the allowable depth of discharge, is defined as
where
${E_{{\textrm{spec}}}}$
is the battery specific energy (Wh kg
${\;^{ - 1}}$
),
${m_b}$
is the battery mass (kg),
${\eta _b} \in \left( {0,1} \right]$
represents the overall battery efficiency (including wiring, regulator and thermal losses), and
${f_{{\textrm{DOD}}}} \in \left( {0,1} \right]$
is the usable depth-of-discharge (DOD) fraction, typically limited to 0.8 to prolong battery lifespan. The specific energy
$_{{\textrm{spec}}}$
is determined from Equation (10) using the nominal battery capacity
$C$
(Ah) and voltage
$v$
(V) provided by the manufacturer.
The electrical power required for hovering,
${P_{{h_e}}}$
, is estimated using momentum theory [Reference Jin, Hu, Zhao, Wang and Ai41] and a propeller figure of merit
${\eta _p}$
, typically ranging between 0.5 and 0.7:
where
${W_{TO}}$
is the take-off weight (N),
$\rho $
is the air density (kg m
${\;^{ - 3}}$
),
${N_r}$
is the number of rotors, and
${r_{{\textrm{prop}}}}$
is the propeller radius (m). The term
$1/{\eta _m}$
accounts for the combined motor and ESC efficiency, which is assumed to be 0.8 for the configuration considered in this study.
The estimated endurance
$E$
(h) of the UAV is then computed as
This formulation provides a practical endurance estimate that incorporates both electrical and mechanical efficiencies, as well as operational battery limitations. For higher-fidelity predictions, Peukert-type corrections and discharge-rate-dependent efficiency factors may be introduced [Reference Mayoral-Vilches, Neuman, Plancher and Reddi42, Reference Rinaldi, Primatesta and Guglieri43].
Here,
$C$
is the nominal battery capacity,
$v$
is the nominal voltage, and
${m_b}$
denotes the battery mass [Reference Biczyski, Sehab, Whidborne, Krebs and Luk38]. To extend battery lifespan, the depth of discharge is limited to 80% [Reference Mayoral-Vilches, Neuman, Plancher and Reddi42, Reference Sier, Yu, Catalano, Queralta, Zou and Westerlund44].
Using Equations (7), (8) and (9), together with
${W_{TO}}$
from Table 4,
$\rho = 1.225{\textrm{kg}}/{{\textrm{m}}^3}$
,
${r_{{\textrm{prop}}}} = 0.127{\textrm{m}}$
,
${\eta _p} = 0.6$
,
${\eta _m} = 0.8$
, and
an initial hover endurance of approximately 8 minutes is obtained. More accurate endurance predictions could be achieved using Peukert-based battery models; however, such analysis lies beyond the scope of this study.
Table 5 summarises the power consumption of the onboard UAV components that directly affect flight endurance. Ground station elements, such as the RC transmitter, are therefore intentionally excluded. The Pozyx UWB tag is also omitted, as it is powered by a dedicated replaceable CR2450 battery (3 V, 620 mAh), for which the manufacturer reports an autonomy of up to five years. It is worth noting that powering the Pixhawk PX4 FCU via USB significantly reduces power consumption compared to supplying it from the main battery through the ESC, owing to the lower efficiency of the ESC’s voltage regulator.
Table 5. Expected maximum power draw of the UAV components

The complete multirotor UAV configuration is shown in Fig. 13.

Figure 13. Implementation of the multirotor UAV.
6.0 Conclusions and Future Directions
This work presents a bespoke, low-cost and modular hardware platform designed to support research in indoor UAV navigation. Many existing approaches rely heavily on simulations that may not accurately represent real-world conditions. The proposed platform enables flexible and scalable experimentation, allowing researchers to adapt it to various indoor environments while maintaining appropriate payload constraints.
An incremental algorithm was implemented to compute Delaunay triangulations for real-time surface reconstruction and mapping. The onboard computer incorporates an FPGA-based DT core capable of achieving real-time triangulation at approximately 70 frames per second (FPS) for point clouds captured by the UAV’s camera. Evaluation using the Stanford Bunny model with varying point densities (477–3811 points) demonstrated a 20
$ \times $
speed-up compared to the software implementation, confirming the FPGA’s suitability for onboard, real-time processing. The UAV platform has a total mass of about 1.5 kg and an estimated endurance of approximately 8 minutes, illustrating its practicality for short-duration indoor experiments.
The FPGA-based deployment of the 2D Delaunay triangulation method, optimised through HLS based on the authors’ previous work [Reference Hadjiloizou, Makridis, Charalambous and Deliparaschos16], enables efficient and real-time processing essential for UAV-based scanning and obstacle detection. Future research will focus on integrating a compact data structure and bin-sorting technique into the DT core to further enhance memory efficiency, processing speed, and scalability.
The proposed UAV platform aims to advance research through real-world testing in complex indoor environments. Ongoing work builds on the authors’ previous research [Reference Deliparaschos, Loizou and Zolotas17, Reference Tech18], extending it to evaluate the stability and performance of filtering algorithms in realistic navigation scenarios.
Data availability statement
Data supporting the work in this paper can be made available from the corresponding author on reasonable request.
Financial support
The authors declare that no funds, grants or other support were received during the preparation of this manuscript.
Competing interests
The authors declare no competing interest.








