1. Introduction
Accurate 3D reconstruction of complex infrastructure remains a challenge in UAV photogrammetry, particularly in the context of digital construction and autonomous inspection, where geospatial science and computer vision converge with the AEC industry. Structures such as bridges, towers, and industrial facilities introduce occlusions, height variations, and complex geometry that worsen visibility and flight planning challenges in UAV photogrammetry. Conventional approaches (e.g., grid or double-grid patterns) often disregard object-specific visibility constraints, generating redundant imagery, suboptimal reconstruction quality, and extended mission durations.
To address these limitations, this work proposes a BIM-aware framework for autonomous UAV trajectory planning and inspection, wherein a minimal camera network is used to generate efficient and feasible flight paths. At the core of the framework is a formal Integer Linear Programming (ILP) formulation [1] that selects the minimal subset of candidate camera poses satisfying strict coverage constraints while minimizing penalties related to the stereo base-to-height ratio (B/H), ground sampling distance (GSD), and 3D triangulation uncertainty. The selected camera positions are then sequenced using a Traveling Salesman Problem (TSP) [2] solver to generate an efficient UAV trajectory that is collision-checked against the BIM-derived voxel model. To ensure real-world feasibility, the optimized TSP trajectory is automatically partitioned into battery-constrained epochs, each fitting within a single UAV flight cycle, thereby enabling execution within standard endurance limits.
Using Building Information Models (BIM), dense candidate views are simulated over known geometry prior to flight. Since BIM/IFC models are central to digital construction workflows, this integration supports QA/QC processes, progress monitoring, as-built versus as-designed verification, and digital twin updates. The pipeline computes coverage-aware, geometry-compliant paths entirely offline, without relying on online SLAM [3,4] or onboard mapping for planning. Preplanned missions can be executed using standard navigation and localization methods such as Global Navigation Satellite System (GNSS) positioning, Real-Time Kinematic (RTK) corrections, or visual–inertial odometry (VIO).
Unlike uniform or heuristic flight plans, the proposed framework generates autonomous, scene-adaptive trajectories derived directly from the optimized camera network. This coupling of ILP-based camera selection with TSP-based sequencing represents a key contribution, ensuring that UAV paths conform to the geometry and occlusion structure of each site. Crucially, the minimal camera network is not an end goal but rather a foundation for deriving efficient, flight-feasible trajectories that advance autonomous UAV inspection capabilities in infrastructure monitoring.
The framework is evaluated across three representative AEC scenarios: a steel truss bridge, a thermal power plant, and a real-world indoor construction site, demonstrating that compact, geometry-aware networks lead to significant reductions in camera usage and mission time, with only minor trade-offs in reconstruction quality.
This paper’s contributions are as follows:
A formal ILP formulation for geometry-aware viewpoint network selection, ensuring that IFC-modeled components (e.g., columns, facades, structural elements) receive sufficient coverage while balancing photogrammetric quality terms (B/H, GSD, triangulation uncertainty). This provides a foundation for reliable as-built vs. as-designed verification.
A BIM-driven pipeline for preflight trajectory planning, where dense visibility simulations over BIM/IFC geometry generate candidate views. This ensures that UAV inspections remain embedded in the digital construction workflow while providing interoperability with existing project models and management systems.
An adaptive trajectory planning method that sequences the optimized viewpoint networks into efficient, structure-aware UAV paths using TSP routing and partitions them into battery-feasible epochs. This makes planned missions practical on active construction sites while delivering auditable, repeatable inspection data that directly supports QA/QC, progress monitoring, and digital twin updates.
2. Related Work
Recent work has shown the increasing use of UAV photogrammetry in AEC applications, including progress monitoring, site documentation, and structural assessment [5,6,7]. These studies highlight the growing role of UAV photogrammetry in AEC practice, further motivating the need for optimized, model-aware mission planning, as addressed in this work.
Building on this growing adoption, it is important to recognize that early UAV photogrammetry missions commonly relied on heuristic flight patterns such as grid or double-grid nadir surveys and façade sweeps [8,9]. While simple to implement, these methods generate redundant imagery and often suffer from poor stereo geometry and extended mission durations. Later, Xu et al. [10] integrated “skeletal camera networks” into Structure-from-Motion (SfM) pipelines to improve reconstruction efficiency by pruning redundant connections between viewpoints; however, this approach may reduce angular diversity, remove redundancies that are useful for robustness, and does not explicitly optimize photogrammetric quality metrics.
Subsequent work on camera network optimization applied sequential filtering [11,12] or set-cover/submodular formulations [13,14]. These methods offered formal optimization potential but were often computationally intensive and lacked explicit enforcement of photogrammetric metrics such as the base-to-height (B/H) ratio, ground sampling distance (GSD), or triangulation accuracy.
In parallel, next-best-view (NBV) and exploration-based planners [15,16,17] addressed unknown or partially known environments by iteratively selecting new viewpoints during flight. While highly adaptive, they provide no global coverage guarantees and may converge to local minima, which limits their reliability for safety-critical inspection of known structures.
More recent approaches leverage digital twin or model-driven planning [18,19,20], where pre-existing models inform candidate viewpoints. Other works have directly coupled UAV inspection with BIM/digital twin workflows for QA/QC and progress monitoring [20,21]. While these works show the potential of model-guided inspection, they typically focus on coverage or image utility without integrating rigorous photogrammetric constraints or directly linking to IFC-based BIM workflows.
In contrast, our work combines global ILP-based camera network optimization with TSP-based trajectory planning, directly tied to BIM/IFC geometry. This ensures per-component coverage, photogrammetric quality, and mission feasibility, bridging the gap between minimal network design and practical UAV inspection in digital construction contexts.
To synthesize the discussion, Table 1 contrasts the main categories of UAV viewpoint and trajectory planning methods. Existing approaches range from heuristic or sequential filtering strategies to more formal submodular and NBV formulations, and more recently, digital-twin–guided planning. However, most either (i) focus on coverage without enforcing photogrammetric constraints, or (ii) treat viewpoint selection and routing as separate problems, without guaranteeing globally optimized and execution-feasible paths. It is important to note that the methods summarized in Table 1 address exploration-based, heuristic, or model-guided planning and do not operate on BIM/IFC geometry or enforce photogrammetric constraints. Therefore, their objectives and inputs differ fundamentally from our BIM-driven ILP–TSP formulation, making direct quantitative comparison infeasible.
In contrast, our framework uniquely combines ILP-based camera network optimization with TSP-based sequencing, directly tied to BIM/IFC geometry to ensure per-component coverage in digital construction workflows.
Positioning the Present Work
As summarized in Table 1, existing methods either rely on heuristic filtering or treat viewpoint selection and trajectory planning as separate problems, often without enforcing photogrammetric quality constraints. The proposed framework addresses these gaps by formulating viewpoint selection as a global ILP optimization over BIM-derived candidates, ensuring strict coverage of IFC components while preserving key photogrammetric quality measures. Crucially, the minimal network is not treated as an end in itself but as the foundation for trajectory generation: a TSP-based solver sequences the selected viewpoints into geometry-adaptive UAV paths that conform to scene complexity and are automatically partitioned into battery-constrained flight missions. To the best of our knowledge, this is the first framework to integrate global camera network optimization with trajectory planning in a BIM-aware pipeline. This coupling ensures operational efficiency and reconstruction fidelity while enabling autonomous UAV inspections to feed directly into QA/QC, progress monitoring, and digital twin updates in large-scale AEC and remote sensing contexts.
3. Methodology
This section presents the complete pipeline for autonomous UAV trajectory planning, integrated with BIM-aware, ILP-based camera network optimization. The proposed approach comprises three interconnected steps: (1) BIM-based scene preparation and visibility simulation, (2) camera network optimization via ILP, and (3) trajectory generation based on a Traveling Salesman Problem formulation. A summary of the proposed approach, illustrating the entire process, is provided in Figure 1. Moreover, the pipeline is designed to operate directly on IFC-based BIMs, ensuring interoperability with construction information systems and enabling inspection results to be integrated into digital twin environments.
3.1. BIM-Based Scene Preparation
The workflow begins with a high-resolution BIM of the target structure, provided in either mesh or IFC format. If the BIM is not already in 3D mesh format, it is first converted, and a dense point cloud is sampled from its surfaces to simulate the as-designed geometry. Subsequently, visibility analysis is performed between the 3D points and a dense set of candidate camera poses. Candidate camera viewpoints are generated around the structure using standard photogrammetric patterns (e.g., double-grid, circular bands, and oblique views). Each candidate pose includes a 6-DOF position and orientation, and its visibility to scene points is evaluated via ray casting to construct a binary visibility matrix. This matrix encodes whether camera i observes point j and serves as the basis for camera selection. Because candidate viewpoints are tied to specific BIM/IFC elements (e.g., facades, columns, beams), coverage can later be evaluated per component, thereby supporting QA/QC and monitoring progress within autonomous digital construction workflows. This preprocessing step is essential for enabling offline mission planning, but does not represent the core contribution of this work.
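The visibility analysis described above can be illustrated with a minimal Python sketch. This simplified stand-in checks only the viewing-cone angle and a maximum imaging distance; the function name and thresholds are illustrative, and the full ray-casting occlusion test against the BIM mesh is omitted:

```python
import numpy as np

def visibility_matrix(cam_pos, cam_dir, points, fov_deg=60.0, d_max=25.0):
    """Binary visibility matrix V[i, j] = 1 if camera i sees point j.

    Simplified check: a point must lie within the camera's viewing cone
    (half-angle fov/2) and within d_max. Occlusion testing via ray
    casting against the BIM mesh is omitted for brevity.
    """
    n_cams, n_pts = len(cam_pos), len(points)
    V = np.zeros((n_cams, n_pts), dtype=int)
    cos_half = np.cos(np.deg2rad(fov_deg / 2.0))
    for i in range(n_cams):
        rays = points - cam_pos[i]              # vectors camera -> points
        dists = np.linalg.norm(rays, axis=1)
        ok = dists > 1e-9                       # avoid division by zero
        cosang = np.zeros(n_pts)
        cosang[ok] = (rays[ok] @ cam_dir[i]) / dists[ok]
        V[i] = ((cosang >= cos_half) & (dists <= d_max) & ok).astype(int)
    return V
```

Here `cam_pos` and `points` are (N, 3) NumPy arrays and `cam_dir` holds unit viewing directions; a camera at the origin looking along +x sees a point at (5, 0, 0) but not one behind it or beyond `d_max`.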
Figure 1.
BIM-based framework. The candidate viewpoints are simulated from IFC models, optimized with ILP, and sequenced into UAV trajectories via TSP and battery partitioning.
3.2. ILP-Based Camera Network Optimization
The central contribution of this paper is an ILP formulation (Appendix A.1) that selects a minimal yet photogrammetrically valid subset of candidate cameras. Each candidate camera $i$ is associated with a binary decision variable $x_i \in \{0, 1\}$, indicating whether it is selected.
Given candidate camera poses and 3D scene points, the optimization problem selects the smallest camera subset that ensures:
Every scene point is observed by at least $k$ cameras (robust reconstruction);
Favorable stereo geometry with optimal base-to-height (B/H) ratios;
Ground sampling distance (GSD) within acceptable thresholds;
Minimal 3D intersection uncertainty across selected views;
Overall sparsity for computational efficiency and mission autonomy.
The complete ILP formulation is explained as follows:
Each candidate camera $i$ is associated with a binary decision variable (Equation (1)), where $x_i$ represents the decision variable:
$$x_i \in \{0, 1\}, \quad i = 1, \dots, N_c \tag{1}$$
A visibility matrix $V \in \{0,1\}^{N_c \times N_p}$ (Equation (2)) establishes the visibility between each camera and each point through geometric intersection and field-of-view checks:
$$V_{ij} = \begin{cases} 1, & \text{if camera } i \text{ observes point } j \\ 0, & \text{otherwise} \end{cases} \tag{2}$$
With $V$ and the decision vector $\mathbf{x} = (x_1, \dots, x_{N_c})$, the optimization problem is formulated as (Equation (3)):
$$\min_{\mathbf{x}} \; \sum_{i=1}^{N_c} x_i \;+\; \lambda_1 \sum_{i=1}^{N_c} \sum_{j=1}^{N_p} V_{ij}\, P^{BH}_{ij}\, x_i \;+\; \lambda_2 \sum_{i=1}^{N_c} \sum_{j=1}^{N_p} V_{ij}\, P^{GSD}_{ij}\, x_i \;+\; \lambda_3 \sum_{i=1}^{N_c} \sum_{j=1}^{N_p} V_{ij}\, P^{3D}_{ij}\, x_i \tag{3}$$
All ILP problems in this work were solved using MATLAB’s intlinprog solver (Optimization Toolbox, R2024a), which applies branch-and-bound with cutting-plane strategies to guarantee global optimality. This solver was used for all experiments, and no external solvers such as Gurobi or CPLEX were required.
The weight parameters $\lambda_1$, $\lambda_2$, and $\lambda_3$ regulate the relative contributions of the B/H ratio penalty, the GSD uniformity penalty, and the triangulation-angle accuracy term, respectively. Their values ($\lambda_1$ = 0.10, $\lambda_2$ = 0.10, $\lambda_3$ = 0.25) were selected to provide a balanced trade-off between minimizing the number of selected cameras and preserving photogrammetric quality. A slightly larger weight is assigned to $\lambda_3$ to favor stronger multi-view geometry and reduce the risk of unstable intersections. A full sensitivity analysis is provided in Appendix A.3, confirming that the optimization remains stable under ±50% variation in these weights.
Subject to the coverage and binary constraints (Equations (4) and (5)):
$$\sum_{i=1}^{N_c} V_{ij}\, x_i \ge k \quad \forall j = 1, \dots, N_p \tag{4}$$
$$x_i \in \{0, 1\} \quad \forall i = 1, \dots, N_c \tag{5}$$
As mentioned, $k$ is the minimum number of cameras required to observe each point. The penalty terms introduced in Equation (3) are applied consistently across all experiments using the fixed weights described above (Table A1).
In summary, the ILP objective in Equation (3) combines sparsity with photogrammetric penalties under the coverage constraint in Equation (4) and binary selection in Equation (5), with decision variables defined in Equation (1) and visibility given by Equation (2).
This formulation yields a globally optimal solution within the defined constraints and visibility model that satisfies coverage requirements while prioritizing geometric strength. The flexibility of the linear penalty terms allows users to prioritize either sparse coverage or higher geometric quality, depending on application requirements.
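For illustration, the selection problem can be sketched in Python with SciPy's `milp` solver; the paper itself uses MATLAB's intlinprog, and this sketch simplifies the objective by folding the weighted penalty terms into a single per-camera cost while enforcing the coverage constraint:

```python
import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

def select_cameras(V, penalties, k=2):
    """Minimal camera subset: min sum(x_i * (1 + penalty_i))
    s.t. every point is seen by >= k selected cameras, x binary.

    V         : (n_cams, n_pts) binary visibility matrix
    penalties : (n_cams,) aggregated per-camera penalty (weighted
                B/H + GSD + uncertainty terms folded per camera)
    """
    n_cams = V.shape[0]
    c = 1.0 + np.asarray(penalties, dtype=float)   # objective coefficients
    cons = LinearConstraint(V.T, lb=k, ub=np.inf)  # coverage: V^T x >= k
    res = milp(c=c, constraints=cons,
               integrality=np.ones(n_cams),        # all variables integer
               bounds=Bounds(0, 1))                # binary via [0,1] + integer
    if res.status != 0:
        raise RuntimeError("infeasible coverage requirement")
    return np.flatnonzero(np.round(res.x) == 1)
```

With a penalized fourth camera still covering both points, the solver trades its extra cost against needing three unpenalized cameras and picks the cheaper pair.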
Figure 2 illustrates the general workflow of ILP-based camera optimization. The detailed computation of the penalty matrices $P^{BH}$, $P^{GSD}$, and $P^{3D}$ is presented in Section 3.3. For large candidate networks, the ILP optimization employs multi-scale strategies to ensure computational tractability (Appendix A.2). Default parameter values are summarized in Table A1.
3.3. Quality-Based Penalty Functions
The ILP formulation introduced in Section 3.2 includes three key penalty matrices, $P^{BH}$, $P^{GSD}$, and $P^{3D}$, which respectively represent stereo geometry quality, ground sampling distance, and 3D triangulation uncertainty. These penalty functions are designed to discourage camera-point configurations that may compromise the geometric accuracy or visual quality of the final reconstruction. In this section, we describe the computation and interpretation of each penalty function in detail.
Specifically, stereo geometry is quantified via the B/H ratio in Equation (6) with the penalty in Equation (7); distance-based resolution is captured by the GSD approximation in Equation (8) with the penalty in Equation (9); and triangulation uncertainty contributes through Equation (10) to the ILP objective as shown in Equation (3).
3.3.1. Base-to-Height Ratio Penalty
For each scene point $j$ and each pair of candidate cameras $(i_1, i_2)$ that can see point $j$, the base-to-height ratio is:
$$(B/H)_{i_1 i_2, j} = \frac{\lVert \mathbf{C}_{i_1} - \mathbf{C}_{i_2} \rVert}{\frac{1}{2}\left( \lVert \mathbf{C}_{i_1} - \mathbf{X}_j \rVert + \lVert \mathbf{C}_{i_2} - \mathbf{X}_j \rVert \right)} \tag{6}$$
where $\mathbf{C}_i$ is the coordinate of camera $i$ and $\mathbf{X}_j$ is the coordinate of point $j$.
For each camera-point pair $(i, j)$, the penalty $P^{BH}_{ij}$ is defined as:
$$P^{BH}_{ij} = \begin{cases} 0, & \text{if } (B/H) \in [r_{\min}, r_{\max}] \\ 1, & \text{otherwise} \end{cases} \tag{7}$$
where $P^{BH}_{ij}$ is a penalty assigned to camera $i$ for viewing point $j$ with a B/H ratio outside the optimal range, and $r_{\min}$, $r_{\max}$ represent the acceptable interval for the base-to-height ratio.
This term penalizes camera-point pairs whose stereo geometry would be too short-baseline or too wide-baseline, thus promoting optimal photogrammetric geometry.
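A minimal sketch of this penalty, assuming the "height" is taken as the mean camera-to-point distance and using illustrative bounds `r_min`/`r_max` rather than the paper's tuned values:

```python
import numpy as np

def bh_penalty(c1, c2, p, r_min=0.2, r_max=0.8):
    """0/1 penalty for a camera pair (c1, c2) viewing point p.

    B/H = baseline between cameras / mean camera-to-point distance
    (the 'height'). Pairs outside [r_min, r_max] are penalized.
    """
    base = np.linalg.norm(np.subtract(c1, c2))
    height = 0.5 * (np.linalg.norm(np.subtract(c1, p)) +
                    np.linalg.norm(np.subtract(c2, p)))
    ratio = base / height
    return 0.0 if r_min <= ratio <= r_max else 1.0
```

A 5 m baseline at roughly 10 m range falls inside the acceptable interval, while a near-zero baseline is penalized as degenerate stereo geometry.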
3.3.2. Ground Sampling Distance Penalty
To ensure that each 3D scene point is imaged at a sufficient spatial resolution, a penalty is introduced based on the GSD as a proxy for image sharpness and detail capture.
For each camera-point pair $(i, j)$, the GSD is approximated as proportional to the Euclidean distance between the camera center and the point:
$$GSD_{ij} \propto d_{ij} = \lVert \mathbf{C}_i - \mathbf{X}_j \rVert \tag{8}$$
where $\mathbf{C}_i$ is the position of camera $i$, and $\mathbf{X}_j$ is the coordinate of scene point $j$.
To discourage views taken from excessively far distances (which reduce resolution), a normalized penalty is applied to camera-point pairs whose imaging distances exceed an acceptable threshold. The GSD-based penalty $P^{GSD}_{ij}$ is defined as:
$$P^{GSD}_{ij} = \max\!\left(0, \frac{d_{ij} - d_{\max}}{d_{\max}}\right) \tag{9}$$
where $d_{\max}$ is the maximum acceptable distance at which high-resolution capture is assumed to still be achievable (e.g., based on sensor specs or mission requirements), and $P^{GSD}_{ij}$ is the penalty assigned to camera $i$ for viewing point $j$ from beyond $d_{\max}$ (i.e., when $d_{ij} > d_{\max}$).
This penalty term ensures that camera viewpoints with unfavorable imaging geometry due to distance are penalized in the ILP objective, thereby promoting viewpoints that capture scene points with higher spatial resolution.
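A possible implementation of this distance-based penalty, assuming a linear normalized excess beyond `d_max` (the threshold value is illustrative):

```python
import numpy as np

def gsd_penalty(cam, point, d_max=20.0):
    """Normalized penalty for imaging distance beyond d_max.

    Zero within range; grows linearly with the relative excess
    distance, consistent with a 'normalized penalty beyond an
    acceptable threshold'.
    """
    d = np.linalg.norm(np.subtract(cam, point))
    return max(0.0, (d - d_max) / d_max)
```

A view at 10 m incurs no penalty for `d_max` = 20 m, while one at 30 m is penalized by 50% of the threshold distance.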
3.3.3. 3D Reconstruction Uncertainty Penalty
In addition to penalizing poor stereo geometry (via B/H ratio) and resolution degradation (via GSD), the third penalty term addresses the 3D reconstruction uncertainty of each point based on the geometric strength of multi-view intersection. This penalty favors camera configurations that result in precise triangulation, thereby improving the overall metric accuracy of the reconstructed point cloud.
For a given 3D point visible from a set of cameras, we estimate its intersection uncertainty using least-squares multi-view triangulation over subsets of visible cameras. The standard deviation of the estimated point coordinates is computed from the posterior covariance matrix and is used as an indicator of spatial accuracy.
Let $P^{3D}_{ij}$ be a penalty assigned to camera $i$ viewing point $j$, based on how much the camera contributes to the intersection error when reconstructing point $j$; $\sigma_j$ is the combined uncertainty from the 3D intersection at point $j$, and $w_{3D}$ is a user-defined penalty weight for reconstruction uncertainty.
For each visible camera-point pair, this uncertainty is used as a penalty proxy (i.e., higher error → higher penalty):
$$P^{3D}_{ij} = \min\!\left(w_{3D}\, \sigma_j,\; P_{\max}\right) \tag{10}$$
Here, $P_{\max}$ is a large constant penalizing numerically unstable configurations. For practical purposes, the values of $P^{3D}_{ij}$ are normalized to the range [0, 1] across all pairs.
Accordingly, this penalty term in Equation (3) is designed to encourage the selection of camera configurations that provide geometrically robust and accurate 3D intersections, especially in scenes with complex topologies or long camera baselines. When combined with the B/H and GSD penalties, the ILP optimization effectively balances coverage, resolution, and spatial accuracy.
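As a rough stand-in for the covariance-based computation, the following sketch scores a point by its best pairwise ray-intersection angle, with uncertainty growing as the geometry becomes degenerate. The 1/sin proxy and the cap `p_max` are assumptions for illustration, not the paper's exact formulation:

```python
import numpy as np

def intersection_uncertainty(cams, point, p_max=10.0):
    """Proxy for 3D triangulation uncertainty at a point.

    Uses the best (widest) pairwise ray-intersection angle among the
    viewing cameras: sigma ~ 1/sin(angle), capped at p_max for
    near-degenerate geometry. A simplified stand-in for the
    posterior-covariance computation described in the text.
    """
    rays = [np.subtract(point, c) for c in cams]
    rays = [r / np.linalg.norm(r) for r in rays]
    best = 0.0
    for a in range(len(rays)):
        for b in range(a + 1, len(rays)):
            cosang = np.clip(rays[a] @ rays[b], -1.0, 1.0)
            best = max(best, np.sin(np.arccos(cosang)))
    if best < 1e-6:           # all rays parallel: unstable intersection
        return p_max
    return min(1.0 / best, p_max)
```

Two cameras symmetrically offset above a point yield a well-conditioned intersection, whereas coincident viewpoints hit the instability cap.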
3.4. Trajectory Planning
Once the minimal set of camera viewpoints is selected using the ILP model, the next step involves computing an efficient autonomous UAV flight trajectory that visits each viewpoint exactly once, while avoiding collisions and minimizing total travel distance. This path optimization problem is formulated as a Traveling Salesman Problem (TSP), in which each selected camera pose represents a node, and edges are weighted by a cost function that incorporates both Euclidean distance and optional visibility scores.
The trajectory planning pipeline consists of several key steps (
Figure 3):
Input Preparation: Load a dense 3D point cloud of the BIM and extract the candidate camera positions from the ILP-optimized network. Define a voxel size appropriate to the UAV’s operational resolution (e.g., 1 m).
Preprocessing: Create a voxel grid over the scene geometry using the point cloud and specified voxel size. Compute a visibility score for each camera position relative to the scene points.
Cost Matrix Computation: Construct a cost matrix using pairwise Euclidean distances between camera positions. Incorporate visibility scores to slightly favor viewpoints with better scene coverage.
TSP Optimization: Solve the TSP using a nearest-neighbor heuristic or an external solver to determine an initial camera visitation order. Check the resulting trajectory for structural collisions via ray-based sampling and voxel intersection tests.
Trajectory Postprocessing: Smooth the collision-free waypoint sequence using spline interpolation. Parameterize the dynamic trajectory with user-defined velocity and acceleration limits to conform to UAV motion constraints. Interpolate orientation angles (yaw, pitch, roll) across trajectory segments to ensure continuous transitions.
Flight Time Estimation: Compute the total UAV travel distance from the smoothed trajectory. Estimate the flight time by summing segment-wise travel durations and per-waypoint hovering times, adjusted for environmental disturbances (e.g., wind).
The optimization goal is defined in Equation (11) as:
$$\min \sum_{i=1}^{M} \sum_{\substack{j=1 \\ j \ne i}}^{M} d_{ij}\, y_{ij} \tag{11}$$
Subject to the constraints in Equation (12) that each waypoint is visited exactly once and the path forms a closed or open tour depending on the UAV mission design:
$$\sum_{\substack{j=1 \\ j \ne i}}^{M} y_{ij} = 1 \quad \forall i, \qquad \sum_{\substack{i=1 \\ i \ne j}}^{M} y_{ij} = 1 \quad \forall j \tag{12}$$
Here, $d_{ij}$ is the distance between waypoints $i$ and $j$, and $y_{ij}$ is a binary decision variable indicating whether the path segment from $i$ to $j$ is traversed.
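The routing step can be sketched with the nearest-neighbor heuristic mentioned in the pipeline; visibility weighting and collision checking from the full pipeline are omitted here:

```python
import numpy as np

def nearest_neighbor_tour(positions, start=0):
    """Greedy TSP ordering over ILP-selected camera positions.

    positions : (N, 3) NumPy array of waypoint coordinates.
    Cost = pairwise Euclidean distance; returns the visiting order
    and the total open-tour length.
    """
    n = len(positions)
    D = np.linalg.norm(positions[:, None, :] - positions[None, :, :], axis=2)
    unvisited = set(range(n)) - {start}
    tour = [start]
    while unvisited:
        last = tour[-1]
        nxt = min(unvisited, key=lambda j: D[last, j])  # closest unvisited
        tour.append(nxt)
        unvisited.remove(nxt)
    length = sum(D[tour[s], tour[s + 1]] for s in range(n - 1))
    return tour, length
```

For collinear waypoints the heuristic recovers the obvious sweep; in practice the resulting order would then be collision-checked against the voxel grid as described above.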
Battery-Constrained Partitioning: Since the optimized trajectory will likely extend beyond the max endurance of a single UAV flight (≈30 min), the final trajectory is automatically partitioned into multiple flight epochs. Each epoch consists of subsets of waypoints and parts of the trajectory that can be executed within a single battery cycle, ensuring complete coverage of waypoints across consecutive flight epochs without violating operational limits. From a digital-construction perspective, although flight-time constraints define epochs, the resulting autonomous mission segments can still be associated with IFC components or construction zones that happen to fall within each flight. This mapping enables inspection data from successive epochs to be aligned with site logistics and phased work packages, supporting structured QA/QC and progress monitoring.
The duration of an epoch is modeled in Equation (13) as the sum of flight time, hover time at waypoints, and a multiplicative wind factor:
$$T_{\text{epoch}} = \left( \sum_{s} \frac{\ell_s}{v} + n_w\, t_h \right) (1 + f_w) \tag{13}$$
where $\ell_s$ is the length of the leg between successive waypoints, $v$ is the cruise speed, $n_w$ is the number of waypoints in the epoch, $t_h$ is the per-waypoint hover time (for image capture), and $f_w$ is the wind factor (safety margin).
An epoch is feasible if $T_{\text{epoch}} \le T_{\text{bat}}$, where $T_{\text{bat}}$ is the battery endurance (e.g., 30 min). To add an operational safety margin, we define an effective cap in Equation (14):
$$T_{\text{cap}} = (1 - \rho)\, T_{\text{bat}} \tag{14}$$
with $\rho$ a small reserve fraction (e.g., $\rho$ = 0.1, i.e., 10%).
The full TSP-ordered trajectory is then partitioned greedily: starting from the first waypoint, the algorithm accumulates flight and hover times until the running total exceeds the effective cap defined in Equation (14). At that point, the epoch is closed, and a new epoch begins from the next waypoint. This process continues until all waypoints are assigned.
Formally, let $w_1, \dots, w_M$ be the TSP-ordered waypoints and $\ell_s$ the length of the segment between $w_s$ and $w_{s+1}$ for $s = 1, \dots, M-1$. Define the cumulative time $T(m)$ as follows (Equation (15)):
$$T(m) = \sum_{s=1}^{m-1} \frac{\ell_s}{v} + H(m) \tag{15}$$
where $H(m)$ is the total hover time up to waypoint $w_m$. Epoch $e$ is defined as a contiguous block of waypoints $\{w_a, \dots, w_b\}$ such that $T(b) - T(a-1) \le T_{\text{cap}}$ and $b$ is the maximum index under this condition. This ensures that each epoch corresponds to a physically executable UAV flight bounded by a single battery cycle, while the union of all epochs preserves the full optimized TSP path.
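The greedy partitioning can be sketched as follows. Default parameter values are illustrative (cruise speed, hover time, wind factor, and reserve fraction are assumptions), and the return-to-home legs between epochs are not counted in the timing:

```python
def partition_epochs(leg_lengths, v=2.0, t_hover=2.0, wind=0.05,
                     t_bat=30 * 60, reserve=0.1):
    """Split a TSP-ordered path into battery-feasible epochs.

    leg_lengths[s] is the distance between waypoints s and s+1.
    Each leg costs (flight + hover) * (1 + wind); an epoch is closed
    when adding the next leg would exceed the effective cap
    T_cap = (1 - reserve) * t_bat. Returns lists of waypoint indices.
    """
    t_cap = (1.0 - reserve) * t_bat
    epochs, current = [], [0]
    t = t_hover * (1 + wind)                    # hover at first waypoint
    for s, leg in enumerate(leg_lengths):
        dt = (leg / v + t_hover) * (1 + wind)
        if t + dt > t_cap and len(current) > 1:
            epochs.append(current)              # close this epoch
            current, t = [s + 1], t_hover * (1 + wind)
        else:
            current.append(s + 1)
            t += dt
    epochs.append(current)
    return epochs
```

With three 100 m legs, a 2 m/s cruise, and a 100 s cap, two legs cannot fit in one cycle, so the path splits into two epochs of two waypoints each.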
Finally, the full optimized trajectory, including interpolated positions and orientations, is exported for simulation or direct execution on a UAV. The result is a smooth, collision-free, dynamically feasible, and battery-constrained flight plan adapted to the scene geometry. This ensures that UAV inspections deliver structured, component-level data suitable for integration with digital-twin and construction management systems.
Figure 3.
Overview of the UAV trajectory planning pipeline using voxel grid analysis and TSP-based optimization.
3.5. Flight Execution
The optimized camera poses are executed by a rotary-wing UAV through waypoint-based navigation. The UAV autonomously follows a GPS/RTK-referenced trajectory generated by the TSP solver, where each waypoint corresponds to an ILP-selected camera pose. Images are automatically captured when the UAV reaches a predefined proximity (e.g., 0.5 m) to each waypoint, based on onboard GNSS and inertial sensing.
To maintain navigation stability in GNSS-degraded environments (e.g., under bridges or near tall structures), the mission can be executed using VIO or an RTK fallback mode. Prior to flight, the BIM can be used to identify potential signal-degraded areas and inform adjustments to flight parameters (e.g., increasing waypoint spacing, reducing speed, or modifying sensor-fusion strategies). This pre-integration of BIM ensures that autonomous flight execution remains consistent with construction site constraints, including known occlusions and access restrictions.
Mission duration is estimated directly from the optimized trajectory using the same timing model and parameters as those employed in the battery-constrained partitioning stage (i.e., flight time along segments plus per-waypoint hovering, with a wind/safety factor applied). This ensures consistency between the execution-time estimate and the epoch boundaries used to guarantee battery feasibility, enabling fair comparison between the optimized and dense networks in Section 4.
After data acquisition, the geotagged images from the optimized camera network are processed offline to generate a dense 3D model. Using only ILP-selected images reduces computational cost and redundancy while maintaining completeness. The resulting point cloud is georeferenced using known extrinsics or BIM-based control, yielding an accurate digital twin suitable for inspection, change detection, and other AEC applications.
3.6. Performance Metrics and Evaluation
To quantitatively assess the effectiveness of the optimized camera network, three complementary metrics are used: Coverage Adequacy, Redundancy Ratio, and Network Efficiency. Together, these metrics assess how well the framework supports autonomous UAV mission efficiency, reconstruction accuracy, and robustness. In a digital-construction context, they also indicate how well UAV inspection data can be linked to BIM/IFC elements in a repeatable and auditable manner, which is critical for QA/QC and progress monitoring.
After optimization, the selected camera network is assessed using:
- 1.
Coverage Adequacy
Coverage adequacy (Equation (16)) measures the proportion of scene points that are observed by at least the required minimum number of cameras ($k$). It is defined as:
$$CA = \frac{1}{N_p} \sum_{j=1}^{N_p} \mathbf{1}\!\left[ \sum_{i=1}^{N_c} V_{ij}\, x_i \ge k \right] \tag{16}$$
where $N_p$ is the total number of scene points, $V_{ij}$ indicates the visibility of point $j$ from camera $i$, and $x_i$ is the binary selection variable for camera $i$.
A value close to 1 indicates that almost all scene points satisfy the minimum coverage requirement, which is essential for robust and accurate 3D reconstruction. At the construction scale, this means that all targeted components (e.g., floors, facades, or structural members) are sufficiently imaged for later verification against BIMs.
- 2.
Redundancy Ratio
The redundancy ratio (Equation (17)) quantifies the degree of excess coverage beyond the required minimum, normalized by the total number of camera-point observations:
$$RR = \frac{\sum_{j=1}^{N_p} \max\!\left(0, \sum_{i=1}^{N_c} V_{ij}\, x_i - k\right)}{\sum_{j=1}^{N_p} \sum_{i=1}^{N_c} V_{ij}\, x_i} \tag{17}$$
This metric captures the fraction of all redundant visibility relationships, i.e., representing coverage beyond the minimum required. A moderate redundancy can be desirable for increasing robustness against occlusion or failures, but excessive redundancy indicates inefficiency. In construction practice, balancing redundancy ensures that critical elements are not undersampled, while avoiding unnecessary flights that slow down inspection cycles.
- 3.
Network Efficiency
Network efficiency (Equation (18)) evaluates the relative reduction in camera usage compared to the initial candidate set:
$$NE = 1 - \frac{N_{\text{sel}}}{N_{\text{cand}}} \tag{18}$$
where $N_{\text{cand}}$ is the total number of candidate cameras and $N_{\text{sel}}$ is the number of cameras selected in the optimized network. A higher value indicates a more compact and efficient solution with fewer cameras retained. For digital construction, this translates into reduced mission time, fewer battery swaps, and lower processing overhead, making UAV-based inspections more practical for integration into routine QA/QC and progress monitoring workflows.
4. Results
To evaluate the effectiveness of the proposed ILP-based camera network optimization, we conducted three experiments: two using detailed synthetic 3D models, (1) a steel truss bridge and (2) an industrial facility, and (3) one using real-world data from an indoor construction site. For all experiments, the optimization was performed with a minimum coverage requirement of $k$ cameras per point. Each scene was reconstructed using both a full dense camera network and a filtered subset obtained via our ILP formulation. The performance of each configuration was assessed in terms of trajectory length, flight duration, reconstruction accuracy, and camera network quality metrics.
For each experiment, we present side-by-side comparisons of UAV flight paths, 3D reconstructions, and quantitative indicators, including Coverage Adequacy, Redundancy Ratio, and Network Efficiency. The results demonstrate that ILP optimization reduces redundant views and overall mission complexity while preserving acceptable geometric reconstruction quality. These outcomes confirm the potential of autonomous UAV planning frameworks to improve the safety, efficiency, and precision of infrastructure inspection missions. The following sections describe the experimental setup and findings in detail.
It is worth mentioning that all ILP experiments were executed on an Intel Core i5-13400F CPU (10 cores, 2.5 GHz) with 64 GB RAM using MATLAB R2024a. Typical solve times were 40–120 s for networks of 1200–3000 candidate views, and below 3 min when hierarchical mode was activated for larger networks.
4.1. First Experiment: Steel Bridge Model
The first experiment is applied to a simulated medium-span steel truss bridge, which is modeled in Blender [22] and represented through its BIM-derived structural elements. The bridge spans approximately 40 m in length and comprises key BIM components such as IfcBeam (horizontal and inclined steel beams), IfcMember (cross-braced trusses), IfcPlate (deck and under-bridge panels), IfcRailing (guardrails), and IfcElementAssembly (connection joints). These entities collectively form the semantic backbone for digital twinning, allowing inspection results and photogrammetric data to be directly indexed to individual bridge components.
Image acquisition was simulated with an RGB camera mounted to a UAV flying at an average altitude of 12 m. The sensor setup assumed a consumer-grade APS-C camera (22.3 × 14.9 mm, 4752 × 3168 pixels) with a 25 mm focal length. This sensor setup represents a typical UAV payload used in infrastructure inspection. To ensure BIM-aware coverage, the original mission was designed to capture all relevant bridge components from different angles, including:
Longitudinal side strips of oblique views aligned with the truss members, ensuring full coverage of chords, diagonals, and connection joints.
Top-down nadir strips covering the deck surface, guardrails, and load-bearing slabs.
Under-bridge parallel strips targeting beams, deck undersides, and cross-bracing elements.
Circular end loops capturing perspective-rich obliques of the bridge portals and pier–superstructure connections.
This BIM-guided perspective selection reflects autonomous UAV inspection priorities in structural monitoring to ensure redundancy for load-bearing members, visibility of fatigue-prone joints, and coverage of the deck’s service surfaces. The initial dense network consisted of 1522 candidate camera viewpoints, from which the ILP optimization selected 567 cameras (≈62% reduction) while satisfying the coverage and photogrammetric quality constraints.
The photogrammetric reconstruction yields a high-fidelity mesh and point cloud, which serve as the basis for comparing accuracy, image reduction, and reconstruction efficiency between the traditional dense network and the BIM-informed, ILP-optimized network.
Figure 4 compares the original preplanned trajectory (left), which encompasses all candidate viewpoints, with the BIM-aware optimized trajectory (right), where ILP-based camera selection is followed by TSP-based sequencing. The results highlight how autonomous BIM-aware planning supports lean yet reliable UAV missions by directly linking inspection imagery to structural components, enabling integration with asset management systems and digital twin workflows.
The ILP optimization yielded a more compact camera set with minor accuracy trade-offs (Table 2).
The optimized TSP trajectory for the steel bridge experiment produced a total flight distance of 2203 m. At a cruise speed of 2 m/s, with 2 s of hover per waypoint and a 5% wind factor, the mission duration was estimated at 39 min 12 s, exceeding the endurance of a single battery.
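A minimal sketch of this timing model, assuming the 5% wind factor is applied to the combined travel and hover time (exactly how the factor enters the paper’s model is not specified):

```python
def mission_minutes(dist_m, n_waypoints, speed=2.0, hover_s=2.0, wind=0.05):
    """Rough mission-duration model: travel at cruise speed plus a fixed
    hover per waypoint, inflated by a wind factor. Applying the factor to
    the total is an assumption made for this sketch."""
    total_s = (dist_m / speed + n_waypoints * hover_s) * (1.0 + wind)
    return total_s / 60.0

# Bridge case: 2203 m tour, 567 selected viewpoints
t = mission_minutes(2203, 567)
print(f"{t:.1f} min")  # ≈ 39 min, in line with the reported 39 min 12 s
```

The estimate lands close to the reported duration, which suggests hover time at the waypoints, not travel, dominates the mission budget.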
To ensure feasibility, the trajectory was partitioned into two epochs under the 27 min battery cap (Table 3). Epoch 1 covers most viewpoints with a longer travel distance, while Epoch 2 completes the remaining coverage within a short 11 min flight.
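One way to realize such a partition is a greedy sweep along the TSP order, closing an epoch whenever the next waypoint would exceed the battery cap. The paper’s exact partitioning rule is not given, so this is an illustrative sketch:

```python
def split_into_epochs(leg_times_s, hover_s=2.0, cap_s=27 * 60):
    """Greedily split an ordered tour into battery-constrained epochs.
    leg_times_s[i] is the travel time into waypoint i; hover time is
    charged per waypoint. Illustrative assumption, not the paper's
    exact partitioning rule (e.g., return-to-home legs are ignored)."""
    epochs, current, t = [], [], 0.0
    for i, leg in enumerate(leg_times_s):
        cost = leg + hover_s
        if current and t + cost > cap_s:
            epochs.append(current)       # close the epoch at the cap
            current, t = [], 0.0
        current.append(i)
        t += cost
    if current:
        epochs.append(current)
    return epochs

# 60 waypoints, 30 s of travel per leg -> 32 s per waypoint incl. hover
epochs = split_into_epochs([30.0] * 60)
print(len(epochs), [len(e) for e in epochs])  # 2 epochs: 50 + 10 waypoints
```

As in the bridge case, the sweep naturally produces one nearly full epoch and a short completion flight.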
Figure 5 shows the UAV trajectory for the bridge case before and after battery-constrained partitioning.
4.2. Second Experiment: Industrial Facility
The 3D object used in this experiment represents a realistic industrial-scale scenario, modeled to simulate a thermal power plant and provided in OBJ format as a purely geometric mesh. Although the source model does not contain semantic BIM information, it is treated as a BIM representation for this experiment. This assumption allows UAV viewpoints to be associated with functional building components and mapped to corresponding IFC classes, enabling BIM-aware flight planning and digital twin integration. The structure comprises the following BIM-assumed elements:
IfcCoolingTower: A large cylindrical structure with a concave profile, representative of real evaporative cooling systems.
IfcChimney: A tall, reinforced exhaust stack acting as the primary vertical landmark, prioritized for structural health monitoring and UAV inspection.
IfcBuilding/IfcBuildingStorey: Block-shaped units of varying heights representing operational and maintenance buildings, subdivided into walls (IfcWall), slabs (IfcSlab), and access elements (IfcDoor, IfcWindow).
IfcDistributionElement: Extensive pipework, ducts, and scaffolding surrounding the cooling tower, introducing significant occlusion and accessibility challenges for UAV-based photogrammetry.
IfcSlab/IfcSite: The rectangular base platform, encompassing access areas, rooftop annexes, and flat operational surfaces.
The initial UAV imaging network was designed to ensure coverage of these BIM components through complementary flight trajectories:
- Circular or spiral passes around the cooling tower (IfcCoolingTower) and chimney (IfcChimney) to achieve dense, multi-angle coverage of these critical vertical structures.
- Horizontal oblique strips around the rectangular building blocks (IfcBuilding) to capture facades and rooftop slabs.
- Close-range passes along scaffolding and pipework (IfcDistributionElement), designed to mitigate occlusions and ensure redundancy in visually complex areas.
- Low-altitude sweeps over the base platform (IfcSlab/IfcSite) to capture rooftop annexes, access points, and flat operational surfaces.
- Curvilinear transitions between circular trajectories to maintain smooth motion continuity.
This model is ideal for evaluating UAV flight planning strategies, especially around tall vertical structures and intricate facades. It enables realistic testing of spiral or orbital UAV trajectories, obstacle-aware navigation, and photogrammetric reconstruction performance in complex industrial scenes. The object was imported into Blender and used as the primary 3D scene for simulating UAV imaging, trajectory generation, and pose evaluation in a controlled environment. As in the first experiment, the simulation used a sensor (22.3 × 14.9 mm) with a resolution of 4752 × 3168 pixels and a 25 mm focal length to emulate a realistic UAV imaging setup. The initial dense network consisted of 1263 candidate camera viewpoints, from which the ILP optimization selected 606 cameras (≈52% reduction) while satisfying the coverage and photogrammetric quality constraints. The visual comparison of UAV trajectories and resulting reconstructions for the full dense network and the ILP-optimized network is shown in Figure 6.
Table 4 summarizes the reconstruction accuracy and network efficiency metrics for the second test scene. As in the first test, the ILP-optimized network produces a more compact camera set with only minor reductions in accuracy.
The optimized TSP trajectory for the industrial plant dataset produced a total of 606 waypoints. The estimated mission duration was well above a single UAV battery limit. To ensure feasible flight execution, the trajectory was partitioned into two epochs (Table 5). Epoch 1 consumed the full 27 min cap, while Epoch 2 completed the remaining coverage in just under 9 min.
Figure 7 shows the UAV trajectory before and after battery-constrained partitioning.
4.3. Third Experiment: Real-World Indoor Construction Site
The third experiment, a real-world UAV flight, was conducted at the ITC Faculty building of the University of Twente in the Netherlands. The hall-type structure is roughly 220 m long and 50 m wide. At the time of data capture (autumn 2021), the concrete and steel framework was in place, exposing the primary load-bearing columns, which were fully visible and accessible for data acquisition.
The UAV platform used was a DJI Phantom 4 Pro v2, equipped with a 1-inch CMOS, 20 MP RGB camera (84° FOV, 24 mm equivalent focal length, mechanical shutter 8–1/2000 s; maximum flight time ≈ 30 min). In this experiment, the actual calibrated focal length was 10.26 mm with a pixel size of 2.4 µm (corresponding sensor size ≈ 13.2 × 8.8 mm). The flight mission was executed in September 2021 with a GSD of approximately 1.5 cm and 90% front/side overlap to ensure high photogrammetric redundancy. Retro-targets were affixed to interior columns; their coordinates were surveyed and used as GCPs for photo alignment, producing a locally oriented model tied to the BIM coordinate frame. A total of six retro-reflective GCPs were mounted on structural columns, evenly distributed along the hall. Their 3D coordinates were measured with a total station to an accuracy of approximately 2–3 mm. During photogrammetric processing, the GCPs were manually identified and marked in Agisoft Metashape (Version 2.2.2) to ensure subpixel accuracy. These GCPs were used both for absolute model alignment and to constrain the bundle adjustment, allowing a quantifiable assessment of geometric accuracy.
The as-planned BIM, supplied by the University, was delivered in IFC format and contains only structural members. To use the BIM for initial flight planning, the model was imported into Autodesk Revit for querying. Structural columns were filtered and exported as STL meshes [23]. A uniform point sampling then converted the meshes into a reference point cloud, as shown in Figure 8. Ground-floor columns whose bases dipped below 0 m (foundation level) were removed to match the UAV data extent.
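The sampling-and-filtering step can be sketched as area-weighted uniform sampling over the mesh triangles, followed by removal of points below foundation level. The mesh and sample count below are hypothetical:

```python
import random

def sample_mesh(triangles, n, seed=42):
    """Uniformly sample n points on a triangle mesh (area-weighted triangle
    choice + uniform barycentric coordinates), mimicking the STL-to-point-
    cloud step. Triangles are ((x,y,z), (x,y,z), (x,y,z)) tuples."""
    def area(a, b, c):
        u = [b[i] - a[i] for i in range(3)]
        v = [c[i] - a[i] for i in range(3)]
        cx = (u[1]*v[2]-u[2]*v[1], u[2]*v[0]-u[0]*v[2], u[0]*v[1]-u[1]*v[0])
        return 0.5 * sum(x * x for x in cx) ** 0.5
    rng = random.Random(seed)
    areas = [area(*t) for t in triangles]
    pts = []
    for _ in range(n):
        a, b, c = rng.choices(triangles, weights=areas)[0]
        r1, r2 = rng.random(), rng.random()
        s1 = r1 ** 0.5  # sqrt trick -> uniform barycentric sampling
        w = (1 - s1, s1 * (1 - r2), s1 * r2)
        pts.append(tuple(sum(w[j] * v[i] for j, v in enumerate((a, b, c)))
                         for i in range(3)))
    # Drop samples below foundation level (z < 0), as done for the columns.
    return [p for p in pts if p[2] >= 0.0]

# Hypothetical example: two triangles forming a vertical quad, z in [-1, 1]
tris = [((0, 0, -1), (1, 0, -1), (1, 0, 1)), ((0, 0, -1), (1, 0, 1), (0, 0, 1))]
cloud = sample_mesh(tris, 500)
print(len(cloud), "points kept above z = 0")
```

Roughly half of the samples survive the z ≥ 0 filter here, mirroring how below-foundation geometry was trimmed to the UAV data extent.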
Only the structural columns from the BIM were used in the ILP optimization. This decision was driven by the inspection objective: the columns are the primary load-bearing elements of the hall structure and, therefore, critical for geometric documentation and monitoring. By restricting the candidate visibility simulation to column surfaces, the optimization focused on ensuring complete and high-quality coverage of these essential components, while avoiding unnecessary viewpoints on non-structural elements. This approach demonstrates how UAV flight planning can be tailored to specific construction priorities to focus data capture on critical load-bearing structures while minimizing redundant or irrelevant data acquisition.
The initial dense network comprised 3025 candidate camera viewpoints, from which the ILP optimization retained 2078 cameras (≈31% reduction) while meeting the coverage requirement and photogrammetric quality constraints. The visual comparison of UAV trajectories and resulting reconstructions for the full dense network and the ILP-optimized network is shown in Figure 9.
Table 6 summarizes the reconstruction accuracy and network efficiency metrics for the third test scene. As in the first two tests, the ILP-optimized network yields a more compact camera set with minor accuracy trade-offs.
Since the optimized UAV trajectory exceeded the maximum endurance of ≈30 min per battery, the TSP path was divided into four epochs that fit within one battery cycle (Figure 10). Each epoch, as shown in Table 7, contains a subset of waypoints, with flight time and distance tailored to remain below the 27 min operational cap (including hovering and wind safety margin).
Although Epoch 3 covers the longest path length (914 m), its duration remains within the 27 min cap because travel time is short relative to the cumulative hover time at the waypoints, which dominates the mission duration.
This real-world validation demonstrates the framework’s readiness for practical deployment in infrastructure inspection and monitoring, supporting repeatable, safety-aware UAV operations in constrained environments.
4.4. Parameter Sensitivity Analysis
To assess the influence of the three penalty weights in the ILP objective function (for B/H, GSD, and triangulation uncertainty), we conducted a parameter sensitivity analysis on the same dataset used in Section 4.1. Starting from the reference configuration (0.10, 0.10, 0.25), each parameter was varied independently by ±50%, while keeping the other two fixed at their reference values. This resulted in nine ILP runs in total: the B/H weight in {0.05, 0.10, 0.15}, the GSD weight in {0.05, 0.10, 0.15}, and the triangulation-uncertainty weight in {0.125, 0.25, 0.375}.
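The nine configurations follow directly from a one-at-a-time sweep; tuple positions below correspond to the reference weight values (which weight is which is not material to the sweep itself):

```python
# One-at-a-time sensitivity sweep: each of the three ILP penalty weights
# is set to 50%, 100%, and 150% of its reference value while the other
# two stay at the reference, giving nine runs in total (the reference
# configuration appears once per swept parameter).
REF = (0.10, 0.10, 0.25)

def sweep(ref=REF, factors=(0.5, 1.0, 1.5)):
    runs = []
    for i in range(len(ref)):          # index of the weight being varied
        for f in factors:
            cfg = list(ref)
            cfg[i] = round(ref[i] * f, 4)
            runs.append(tuple(cfg))
    return runs

runs = sweep()
print(len(runs), "configurations")  # 9 configurations
```

Each tuple would then be passed to the ILP as its objective weights, with cameras, rays per point, and coverage fraction recorded per run.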
For each run, we recorded the number of selected cameras, the minimum and mean number of rays per 3D point, and the fraction of points meeting the coverage requirement (at least C = 5 views). The results, summarized in Table A2, indicate that the ILP solution is robust to moderate changes in the weight parameters: across all tested configurations, the number of selected cameras varied only slightly, and both the mean coverage and the fraction of points satisfying the coverage threshold remained high.
The most notable effect is a small trade-off involving the accuracy weight: higher values slightly increased the number of selected cameras, as the optimizer favored configurations with improved intersection geometry. However, the resulting networks remained close to the reference solution in both size and coverage.
5. Discussion
The results from all three experiments highlight the effectiveness of the proposed ILP-based camera network optimization in balancing data efficiency and reconstruction quality. Across the bridge, industrial facility, and indoor construction site cases, the ILP approach consistently achieved substantial reductions in both average cameras per point and overall network size, ranging from approximately 31% to nearly 63%, while maintaining high coverage adequacy and acceptable geometric accuracy. In all experiments, the TSP-generated trajectories were automatically partitioned into battery-constrained epochs to ensure mission feasibility within UAV endurance limits. This partitioning preserved the efficiency gains of the ILP optimization while producing autonomous, executable flight plans. The method ensures that flight planning remains tightly coupled with BIM/IFC geometry, allowing inspection outputs to be indexed by structural component and seamlessly reused within digital construction and remote-sensing workflows.
5.1. Trade-Offs and Reconstruction Quality
As shown in Table 2, Table 3, Table 4, Table 5 and Table 6, the ILP optimization consistently reduced camera count and redundancy across all experiments, while maintaining high coverage adequacy and only marginally increasing reconstruction error. These deviations, which were at the millimeter level for the bridge and indoor site and at the centimeter level for the industrial facility, remain well within practical tolerances for UAV-based inspection. These results illustrate a deliberate trade-off: leaner image sets and shorter missions in exchange for a slight loss in accuracy, which is acceptable in most as-built verification and monitoring contexts where mission autonomy and operational efficiency are prioritized.
In addition to the global trends, the performance gains exhibited scenario-dependent behavior, reflecting the influence of geometric layout, occlusion patterns, and structural complexity. The bridge experiment benefited from its highly regular configuration: parallel girders, repetitive truss elements, and predictable visibility corridors. These characteristics enabled substantial pruning of redundant viewpoints without degrading reconstruction quality. In contrast, the industrial facility contained dense and irregular equipment layouts, occluding pipes, and non-uniform façade structures. These factors required retaining more viewpoints to satisfy multi-view constraints and intersection-angle quality, resulting in a smaller reduction ratio and slightly higher residual errors. The third experiment, the indoor construction hall, showed an intermediate pattern: repetitive column spacing and mostly unobstructed lines of sight allowed moderate viewpoint reduction while maintaining strong geometric consistency.
These observations indicate that the ILP-based optimization does not produce linear or uniform improvements across all environments. Instead, its effectiveness adapts to the inherent visibility structure of each site. This scenario-dependent response is expected in real-world inspection settings and demonstrates the robustness of the proposed method. Accordingly, the ILP maintains required photogrammetric quality where necessary, while reducing redundancy and flight time where scene geometry allows.
5.2. Mission Efficiency and Practical Value
The shorter UAV trajectories and reduced mission durations, shown in Figure 4, Figure 6 and Figure 9 (and Figure 5, Figure 7 and Figure 10 for battery-constrained partitioning), illustrate the operational benefits of ILP-based planning. For example, in the bridge experiment, the optimized trajectory reduced flight distance by over 790 m and saved more than 40 min of mission time. In the industrial facility, flight time was reduced from 64 to 37 min, and in the indoor construction case, from 144 to 104 min. These reductions represent a significant improvement in safety, autonomy, and energy efficiency for UAV inspection.
To ensure feasibility, the optimized TSP trajectories were further partitioned into battery-constrained epochs, each fitting within a single UAV flight cycle (approximately 20 to 27 min). Across the three experiments, this resulted in two to four epochs per mission, providing complete coverage while respecting endurance limits. This automatic partitioning preserved the global efficiency of the ILP–TSP optimization and produced flight plans that were directly executable in practice.
The optimized camera networks also yielded cleaner TSP-based paths, minimized hover-and-capture actions, and reduced both energy consumption and mechanical wear. In construction contexts, this translated into fewer battery swaps, shorter on-site inspection windows, and reduced disruption to other site activities, making autonomous UAV-based inspections easier to integrate into daily project workflows. By incorporating photogrammetric quality metrics, such as B/H ratios, GSD, and triangulation uncertainty, as soft penalties in the ILP objective, the framework enforced both geometric soundness and resource-efficient planning.
The slight increase in reconstruction error introduced by the ILP-optimized networks (millimeter-level for the bridge and indoor case; centimeter-level for the industrial facility) remains within acceptable tolerances for most digital-twin update workflows and general construction documentation [24,25]. For high-accuracy tasks, such as deformation monitoring or fabrication tolerance checks, the ILP penalty weights can be tuned to enforce stricter photogrammetric geometry (GSD, B/H, and intersection-angle constraints). Increasing these weights produces denser camera networks and lower errors, enabling practitioners to select the desired balance between flight efficiency and reconstruction precision, depending on project requirements.
5.3. Limitations
The current framework assumes an idealized execution environment and does not explicitly model environmental disturbances such as wind, GNSS signal loss, or temporary occlusions. These factors can affect the UAV’s ability to reach planned viewpoints or maintain correct camera orientation, particularly in outdoor settings. Further validation in complex, GNSS-challenged environments is therefore required to assess robustness under operational conditions.
The method also assumes a uniform coverage requirement for all 3D points, which may not be suitable for applications where certain areas demand denser sampling due to structural criticality or inspection standards. A logical extension would be to incorporate IFC-based component weighting, ensuring that critical elements such as load-bearing columns, joints, or façades receive higher guaranteed coverage.
Collision checking is conducted using ray-based tests on a BIM-derived voxel model. Consequently, safety guarantees are model-relative and may not account for unmodeled obstacles or dynamic scene changes such as scaffolding, vehicles, or personnel. Operational safety thus still depends on on-site procedures and, when necessary, onboard sensing.
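Ray-based tests on a voxel model are commonly implemented with a 3D DDA traversal (Amanatides–Woo style). The following generic sketch, not the paper’s exact implementation, marks a camera-to-camera leg as blocked if it crosses an occupied BIM voxel:

```python
import math

def voxel_ray_hits(origin, target, occupied, voxel=1.0):
    """3D DDA traversal: return True if the straight segment from origin
    to target crosses any occupied voxel. 'occupied' is a set of integer
    (i, j, k) voxel indices derived from the BIM model. Generic sketch of
    ray-based collision checking, not the paper's exact implementation."""
    ijk = [int(math.floor(o / voxel)) for o in origin]   # start voxel
    end = [int(math.floor(t / voxel)) for t in target]   # end voxel
    d = [t - o for o, t in zip(origin, target)]
    step = [1 if x > 0 else -1 for x in d]
    t_max, t_delta = [], []
    for a in range(3):
        if d[a] == 0:
            t_max.append(float("inf")); t_delta.append(float("inf"))
        else:
            nxt = (ijk[a] + (step[a] > 0)) * voxel       # next grid plane
            t_max.append((nxt - origin[a]) / d[a])
            t_delta.append(abs(voxel / d[a]))
    while True:
        if tuple(ijk) in occupied:
            return True
        if ijk == end:
            return False
        a = t_max.index(min(t_max))                      # nearest boundary
        if t_max[a] > 1.0:        # segment ends before the next boundary
            return tuple(end) in occupied
        ijk[a] += step[a]
        t_max[a] += t_delta[a]

# A wall voxel at x in [2, 3) blocks a straight leg along the x axis
blocked = voxel_ray_hits((0.5, 0.5, 0.5), (4.5, 0.5, 0.5), {(2, 0, 0)})
print(blocked)
```

Because the occupancy set comes from the BIM, any unmodeled obstacle is invisible to this test, which is exactly the model-relative safety limitation noted above.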
Additionally, operational logistics between battery-constrained epochs, such as takeoff and landing, battery replacements, waypoint reinitialization, RTK or communication relock, and crew repositioning, are not modeled. These factors impose non-negligible timing and safety constraints that can influence feasible epoch boundaries. Permissions for launch and landing, availability of safe staging areas, and coordination with other site activities also remain outside the current planning scope.
A limitation of the current framework is that the ILP–TSP planning assumes nominal flight conditions and does not explicitly model disturbances such as wind, GNSS degradation, or electromagnetic interference. These factors can influence trajectory tracking, image sharpness, effective GSD, and the stability of the planned viewing geometry. Although the timing model incorporates a wind/safety factor, and GNSS-degraded areas can fall back on RTK or VIO, the optimization itself remains environment-agnostic.
Another limitation lies in the assumption that the BIM or IFC model perfectly reflects the as-built geometry. In practice, discrepancies can arise due to construction progress, temporary structures, missing components, or deviations from the design model. Such inconsistencies may lead to small errors in visibility estimation, inflated or deflated occlusion volumes, and local inaccuracies during the collision-checking phase. These effects could slightly alter the coverage matrix or the feasibility of the computed path, particularly in cluttered indoor environments.
5.4. Behavior and Strengths of the ILP Optimization
The ILP formulation ensured globally optimal camera selection within minutes, even for networks with thousands of candidates. Unlike heuristic filtering, it guarantees per-point coverage while balancing photogrammetric quality penalties such as B/H, GSD, and triangulation uncertainty. By adjusting penalty weights, operators can flexibly prioritize either coverage density or geometric accuracy. This globally optimal and tunable optimization behavior contributes to the framework’s robustness across the three diverse test cases and demonstrates its suitability for autonomous, large-scale UAV inspection and monitoring tasks.
6. Conclusions
This study introduced a BIM-aware and autonomous UAV framework for optimal camera-network selection and trajectory planning in photogrammetric infrastructure inspection. By jointly optimizing camera placement with respect to coverage, stereo geometry, ground-sampling distance (GSD), and triangulation uncertainty, the method generates compact camera configurations that preserve high reconstruction quality. Integration with a TSP-based sequencing strategy enables efficient, collision-checked trajectories adapted to scene geometry and constrained by the BIM. Because the entire pipeline is driven by BIM and IFC data, inspection outputs can be indexed to specific structural components, ensuring traceable coverage valuable for construction QA/QC and digital-twin workflows.
Evaluation across three distinct case studies, a simulated steel bridge, a simulated industrial facility, and a real-world indoor construction site, demonstrated substantial reductions in camera usage, ranging from 31 to 63 percent, along with significant mission time savings (up to 50%). These benefits were achieved with only minor trade-offs in reconstruction accuracy. The approach was computationally efficient, adaptable to both synthetic and real environments, and directly executable in multi-flight missions through battery-constrained epoch partitioning. This capability enabled complete coverage of large or complex structures while maintaining feasibility within UAV endurance limits. The resulting gains translated into fewer images to process, fewer battery swaps, and leaner inspection missions, ultimately reducing operational overhead while maintaining the fidelity required for as-built verification.
Several directions exist for further development. These include validating performance in outdoor environments affected by wind, lighting variability, and GNSS degradation and extending the system to support cooperative or multi-UAV planning for larger infrastructure. Future work should also explore component-weighted optimization for prioritizing critical structural elements, as well as tighter integration with cloud-based digital twin platforms and automated links to progress-monitoring systems. With these enhancements, the framework can evolve into a fully autonomous, field-ready UAV inspection tool that bridges remote sensing, robotics, and digital construction—delivering pre-planned, auditable, and data-driven inspection outputs for QA/QC, maintenance, and infrastructure management.
Future work may extend the framework to incorporate environment-aware robustness into the optimization and execution phases. Potential enhancements include increasing redundancy in wind-exposed or GNSS-challenged regions, expanding voxel-based collision buffers to account for trajectory-tracking uncertainty, and adjusting TSP edge costs based on predicted wind loads or signal degradation. Another promising direction is to couple the offline ILP plan with an online controller or SLAM/VIO-based local replanner, enabling the UAV to adhere to the BIM-aware global path while dynamically adapting to real-world disturbances during flight.
Ultimately, future work will incorporate mechanisms to handle uncertainties in BIM geometry. Potential extensions include integrating incremental as-built updates from photogrammetric or LiDAR scans and performing visibility and ILP optimization over ensembles of perturbed models to improve robustness. Another promising direction is to fuse BIM geometry with SLAM- or VIO-based local mapping to dynamically correct misaligned or outdated components during inspection. These additions will enable the system to operate reliably even when the BIM does not perfectly match the real construction site.