Abstract
Ensuring the quality and structural stability of industrial steel buildings requires precise geometric control during the execution stage, in accordance with assembly standards defined by EN 1090-2:2020. In this context, this work proposes a methodology that enables the automatic detection of geometric deviations by comparing the intended design with the actual as-built structure using a Terrestrial Laser Scanner. The integrated pipeline processes the 3D point cloud of the asset by projecting it into 2D images, on which a YOLOv8 segmentation model is trained to detect, classify and segment commercial steel cross-sections. Its application demonstrated improved identification and geometric representation of cross-sections, even in cases of incomplete or partially occluded geometries. To enhance generalisation, synthetic 3D data augmentation was applied, yielding promising results, with a segmentation mAP@50-95 of 70.20%. The methodology includes a systematic segmentation-based filtering step, followed by the computation of Oriented Bounding Boxes to quantify both positional and angular displacements. The effectiveness of the methodology was demonstrated in two field applications during the assembly of industrial steel structures: up to 94% of structural elements were assessed in real assemblies, with 97% valid segmentations enabling reliable geometric verification under the standard.
1. Introduction
The Architecture, Engineering and Construction (AEC) sector has seen substantial advancements in the integration of digital technologies, including Building Information Modelling (BIM), Terrestrial Laser Scanning (TLS) and automated quality control tools. These innovations have transformed conventional tasks such as geometric quality control and structural inspection into faster, more reliable and increasingly automated digital workflows [1,2,3]. Despite this progress, many construction activities still rely on traditional visual inspections and selective manual measurements carried out by field inspectors, which may be inadequate for precise geometric assessments. A more systematic, data-driven approach involves using high-resolution spatial data captured through remote-sensing technologies. Among these, TLS is a static, close-range application of Light Detection and Ranging (LiDAR) principles, measuring distances either by Time-of-Flight (ToF), i.e., the delay between emitted and reflected pulses, or by phase shifts in the modulated laser signal [4]. This technique offers a cost-efficient approach to acquiring extensive geometric and visual data across the region of interest (ROI) [5].
In the current digitalisation era, laser scanners are extensively used for rapid and accurate object measurements through point cloud analysis. These point clouds capture the detailed geometry of an asset’s surface with high precision [6]. However, TLS data face several practical limitations. Oytun and Atasoy [7], using a Leica ScanStation P40, found that crack-width measurements are strongly influenced by material type, scanner-to-object range and angle of incidence. The authors concluded that cracks with widths below 2 mm could not be reliably measured regardless of scan resolution. Chacón, et al. [8] used TLS to capture “As-Built” point clouds of laboratory-scale stainless-steel frames subjected to vertical and lateral loads. These point clouds were converted into BIM-compatible wireframe models that preserved the measured deformations of known structural members for subsequent nonlinear analyses. However, stationary laser scanners are limited by line-of-sight constraints, leaving elevated or occluded areas unscanned. To overcome this, ref. [9] combined static TLS with a UAV-mounted Aerial Laser Scanner (ALS), which applies the same LiDAR technology but extends coverage through aerial surveying. In their circular-plan water-tower case study, TLS captured the lower sections, while the aerial survey covered the upper and occluded areas, albeit with lower precision. By slicing the merged point cloud into horizontal cross-sections and fitting ideal circles to each slice, the authors quantified radial deviations from the nominal geometry, reporting an average error of 13 mm. This hybrid approach combines the millimetric accuracy of TLS for ground-level surfaces with the broader coverage of ALS for areas beyond TLS reach.
Accurate alignment between TLS-acquired point clouds and digital models is essential for geometric assessment. When available, Ground Control Points (GCPs) provide the most rigorous spatial referencing across both laser scans and design models. However, in many cases, such as industrial construction sites, GCP deployment is either impractical or incomplete. In this context, local registration processes become crucial. Among them, the Iterative Closest Point (ICP) algorithm, introduced by [10], estimates rigid-body transformations by iteratively minimising distances between corresponding point pairs, though its accuracy depends on the initial alignment. To address this limitation, hybrid methods now combine global and local registration stages. For instance, ref. [11] developed an enhanced alignment pipeline integrating Sample Consensus Initial Alignment (SAC-IA) for coarse registration with a bidirectional KD-tree-optimised ICP (KD-ICP) for refinement. Other variants, such as the GF-ICP, integrate geometric features such as surface curvature and point density, reducing the dependence on an initial alignment [12].
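As a minimal illustration of this local registration step (a generic sketch, not the exact pipelines of refs. [10,11,12]), the following Open3D snippet runs point-to-point ICP between two roughly pre-aligned clouds; the file names and the 5 cm correspondence distance are assumptions.

```python
import open3d as o3d

# Hypothetical file names; both clouds must share a rough initial alignment.
source = o3d.io.read_point_cloud("as_is_scan.ply")       # TLS point cloud
target = o3d.io.read_point_cloud("as_design_model.ply")  # sampled design model

# Point-to-point ICP: iteratively pairs nearest points within 5 cm and
# estimates the rigid-body transform that minimises their distances.
result = o3d.pipelines.registration.registration_icp(
    source, target, max_correspondence_distance=0.05,
    estimation_method=o3d.pipelines.registration.TransformationEstimationPointToPoint())

print("fitness:", result.fitness, "inlier RMSE:", result.inlier_rmse)
source.transform(result.transformation)  # align the As-Is cloud to the reference
```

As noted above, the quality of such a refinement depends on the initial alignment, which is why coarse global registration (e.g., SAC-IA) typically precedes it.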
Beyond measurement-oriented TLS applications, recent studies have explored geometry-driven approaches that interpret point cloud data at a structural level. In this context, ref. [13] proposed a framework for deriving simplified wireframe-based structural representation from point cloud data, focusing on the inference of structural elements to support condition assessment of civil infrastructures. While existing approaches primarily focus on reconstructing structural representations directly from point clouds, there remains a lack of automated, regulation-oriented methodologies specifically designed for geometric conformity verification.
Although point clouds are highly effective for detailed 3D modelling and spatial analysis, certain applications—especially those requiring faster processing or simpler data representation—benefit from 2D imagery. A key strategy involves projecting 3D point clouds onto 2D image planes. He, et al. [14] encoded colour and depth into standard 2D image formats, reducing data complexity without losing relevant geometry. Similarly, ref. [15] combined this projection technique with Deep Learning (DL) models for UAV-based bridge inspections, significantly improving crack detection by mitigating background interference and misclassification.
Recent studies have integrated DL algorithms into point cloud processing, enabling more automated and accurate workflows. In 2017, Qi et al. introduced PointNet [16], a pioneering DL model that directly processes unordered point sets. Subsequent architectures, like PointNet++ [17], OctNet [18], PointVGG [19] and PointGrid [20] incorporated hierarchical feature extraction and spatial partitioning, improving 3D object classification. Furthermore, SpiderCNN [21], PointCNN [22] and Dynamic Graph CNN (DGCNN) [23] extended these capabilities to segmentation and feature extraction, enhancing tasks such as point cloud segmentation, object detection, construction progress monitoring and structural integrity assessment.
In recent years, deep learning frameworks have been integrated into image-based object detection algorithms, relying on bounding-box representations to localise and classify features. Among these, Horizontal Bounding Boxes (HBBs) remain prevalent due to their simplicity and efficiency [24]. Modern detectors are typically divided into two-stage and single-stage approaches. Two-stage networks, such as Faster R-CNN, first generate region proposals using a Region Proposal Network (RPN) before refinement and classification. Fu, et al. [25] embedded the StairNet module within a Faster R-CNN architecture to detect pavement cracks, outperforming conventional CNNs, including GoogLeNet [26], VGG16 [27], ResNet34 [28] and MobileNet [29,30]. Single-stage detectors, in contrast, merge region proposal and classification into a single CNN step. The You Only Look Once (YOLO) series exemplifies this, achieving real-time detection, precise localisation and class prediction [31]. The YOLO process comprises the following: (i) division of the input image into a grid, with each cell responsible for the objects whose centres fall within it; (ii) per-cell prediction of bounding boxes, confidence scores and class probabilities; and (iii) final detections obtained by combining confidence scores with class probabilities [32,33]. Successive versions—up to YOLOv13—have progressively enhanced accuracy and performance [34].
Some applications in the civil engineering sector have demonstrated impressive performance when applying YOLO to multiple computer vision tasks [29,30,35,36]. In the domain of semantic image segmentation, YOLOv5 and YOLOv8, developed by Ultralytics, represent two of the most widely utilised approaches. Casas, et al. [37] conducted a comparative study assessing their respective efficiency, robustness and adaptability for complex engineering tasks. The researchers found YOLOv8 outperformed YOLOv5 in terms of speed, accuracy and generalisation capabilities. Kumar, et al. [38] employed an earlier version, YOLOv3, for real-time detection of cracks and spalling on concrete structures. This methodology used HBBs to localise damage, successfully detecting cracks wider than 0.20 mm from 400 mm away. Similarly, ref. [39] used YOLOv5 to develop a scalable tool for bridge maintenance and condition monitoring, capable of detecting multiple defects in reinforced concrete bridges. Additionally, ref. [40] applied YOLOv5 to detect and classify efflorescence on brick facades, enabling automatic differentiation between bricks that require simple cleaning and those requiring structural repair.
However, HBBs cannot always accurately represent objects with arbitrary orientation. To address these limitations, researchers developed advanced mathematical approaches for representing Oriented Bounding Boxes (OBBs), which can rotate to precisely match the orientation of an object. In contrast to HBBs, OBBs offer the flexibility to align the bounding box with the object’s orientation, resulting in a more accurate and compact representation of its actual shape [41,42,43]. This approach has enabled various applications in fields where objects may appear at arbitrary orientations, such as remote sensing for aerial and satellite imagery [41,42,44], text detection [45] and 3D object detection [46,47]. Nonetheless, these methods adopt a five-parameter representation—consisting of the object’s centre coordinates, width, height and rotation angle—which can lead to angular discontinuities and parameter inconsistencies. Qian, et al. [48] addressed the OBB challenges by shifting from the common five-parameter labels to an eight-parameter corner-point regression model that directly predicts the four vertex coordinates of the OBB. An alternative approach is to frame OBB detection as a segmentation task rather than a direct coordinate regression. For instance, ref. [42] introduced the Mask-OBB, a semantic-attention mask representation that transforms OBB estimation into a pixel-level classification problem. The methodology uses predicted masks to generate precise OBBs, thereby effectively avoiding the boundary ambiguity and discontinuity problems. In a related effort, ref. [44] developed a methodology specifically designed for high-resolution remote sensing imagery, aiming to address the challenges of segmenting complex object boundaries.
This paper presents a methodology that integrates TLS and DL techniques to automate the geometric control of industrial steel structures during assembly. The proposed approach addresses limitations of conventional manual inspection and heuristic-based procedures through the following key contributions:
Use of terrestrial laser scanning and deep learning to replace traditional geometric control methods that rely on manual, error-prone measurements.
Automation of wireframe extraction from IFC models, enabling precise alignment with point cloud data.
YOLOv8-based instance segmentation of 2D steel cross-sectional projections, enabling reliable identification of commercial steel profiles even under noise, occlusion and incomplete geometry conditions [49].
A synthetic 3D data augmentation strategy to generate projection datasets for YOLOv8 training, representative of real-world conditions.
Comprehensive field validation following EN 1090-2:2020 [50] criteria for geometric assembly control of industrial steel structures.
2. Methodology
This section provides an overview of the proposed framework for the automated geometric control of industrial steel structures. The framework integrates high-resolution 3D point cloud data, captured using a TLS, and structural information from IFC models to enable accurate comparisons between design and actual construction. Additionally, for improved automation and accuracy, a custom YOLOv8 segmentation model was trained to detect and classify cross-sectional steel profiles from 2D projections of the 3D point cloud data. The YOLOv8 model enabled robust feature extraction for the automatic assessment of geometric deviations according to EN 1090-2:2020. The methodology comprises four main stages: (i) data acquisition, (ii) structural processing and data extraction, (iii) YOLOv8 segmentation and identification of steel profiles, and (iv) geometric analysis and EN 1090-2:2020 validation. Figure 1 provides an overview of the complete methodology.
Figure 1.
Overview of the proposed methodology.
2.1. Data Acquisition
Accurate geometric control begins with reliable access to two essential data sources: the 3D point cloud model, which reflects the current physical state of the structure, and the IFC model, which contains the intended design configuration. These models enable direct geometric comparison between the as-built and as-designed conditions. To support this, both datasets are aligned within a shared geospatial reference system, forming a reliable foundation for subsequent geometric deviation analysis.
In order to ensure the highest standards regarding data quality and acquisition conditions, the TLS equipment and survey settings used to capture the As-Is model are described in the following subsection.
2.1.1. Equipment and Acquisition Settings
The As-Is structural configuration was geometrically captured using terrestrial laser scanning, relying on commercial laser scanning systems that are suitable for construction-scale geometric monitoring and quality control. The methodology was tested with two laser scanners: the Leica BLK360 G1 and the Leica RTC360 (Leica Geosystems, Heerbrugg, Switzerland). Table 1 summarises the main technical characteristics of the two systems. Although both scanners are based on ToF ranging and provide full panoramic coverage, they differ in terms of ranging accuracy, acquisition speed and operating range.
Table 1.
Main technical specification of the equipment used, as reported by [51].
The Leica BLK360 G1 (Figure 2a), characterised by its low weight and compact form factor, offers a nominal ranging accuracy of ±4 mm at 10 m and an effective maximum range of approximately 60 m. Under these specifications, the scanner produces point cloud data of sufficient geometric fidelity to support the subsequent processing stages.
Figure 2.
Terrestrial laser scanners employed: (a) Leica BLK360 G1; (b) Leica RTC360 (adapted from [51]).
The Leica RTC360 (Figure 2b) provides an enhanced ranging performance, combining nominal ranging accuracy of ±1 mm at 10 m with an effective maximum range of 130 m and a high measurement rate. A distinguishing feature of this system is the integration of a Visual-Inertial System (VIS), which uses video-based motion tracking with inertial measurements to support real-time, targetless registration between consecutive scan positions. This capability improves registration stability and reduces cumulative alignment errors during large surveys.
The accuracy of the data fundamentally depends on careful pre-survey planning and scanner configuration. This includes determining the optimal number of scans and strategically positioning the scanner to ensure full visibility of the steel element cross-sections. Due to the complex geometry of steel profiles, inappropriate scan positioning may result in occlusions and incomplete sampling of critical geometric features. Figure 3 illustrates the influence of scan positioning on data quality. Figure 3a highlights a non-optimal acquisition scenario, where the interaction between the TLS laser beam and the geometry of the steel profile leads to partial or distorted cross-sectional information. In contrast, Figure 3b presents the proposed scanning strategy to ensure full visibility and accurate capture of the complete geometric configuration of the steel element.
Figure 3.
Influence of TLS scan positioning on cross-sectional data quality: (a) non-optimal scanner position; (b) optimised scanning configuration ensuring maximum visibility and accurate acquisition.
2.1.2. As-Is Model
The As-Is 3D model is generated from a TLS survey using medium-density scanner settings. An iPad equipped with a LiDAR sensor captures additional detail in hard-to-reach areas, and real-time pre-alignment is performed in Leica Cyclone FIELD 360.
For post-survey processing, the raw scans are imported into Leica Cyclone REGISTER 360 PLUS for global registration and alignment, minimising linking errors. The registered point cloud is georeferenced using on-site GCPs obtained from a topographic survey to ensure consistency within the shared geospatial system. Irrelevant components are then removed, retaining only the structural elements. The dataset is optimised through radius-based outlier removal and voxel down-sampling to reduce noise and density while preserving geometric features.
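A minimal sketch of this clean-up stage is given below, assuming the Open3D library; the neighbourhood and voxel parameters are illustrative assumptions, not the settings used in the surveys.

```python
import open3d as o3d

pcd = o3d.io.read_point_cloud("registered_structure.ply")  # hypothetical file

# Radius-based outlier removal: discard points with too few neighbours
# inside a 5 cm sphere (isolated noise and stray returns).
pcd, _ = pcd.remove_radius_outlier(nb_points=16, radius=0.05)

# Voxel down-sampling: keep one representative point per 5 mm voxel,
# reducing density while preserving the surface geometry.
pcd = pcd.voxel_down_sample(voxel_size=0.005)

o3d.io.write_point_cloud("optimised_structure.ply", pcd)
```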
2.1.3. As-Design Model
The As-Design data analysis process begins with the extraction of geometric and structural features from the IFC file describing the BIM asset. An algorithm processes the IFC data, extracting element information, including columns and beams, calculating precise endpoints, node coordinates, lengths and related metadata. Principal Component Analysis (PCA) [52] is then applied to determine the principal axis of each object in the model and compute its 3D node coordinates, as shown in Figure 4. Vertex coordinates are arranged in a matrix $X \in \mathbb{R}^{n \times 3}$, containing each vertex’s normalised 3D spatial coordinates after mean subtraction. The covariance matrix $C$ is calculated to reveal the object orientation, as in (1):

$$C = \frac{1}{n-1} X^{\top} X \qquad (1)$$
Figure 4.
PCA application for wireframe extraction.
This matrix is decomposed into eigenvectors and eigenvalues using $C = V \Lambda V^{\top}$, where $V$ defines the principal-axis directions and $\Lambda$ represents the variance of each principal component (PC). Geometrically, the transformed data matrix $T = XV$ (score matrix) projects the original data onto the PCs, from which node coordinates are derived to reconstruct the wireframe of the elements.
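As a concrete illustration, the following numpy sketch applies the PCA step above to a single element, returning the two wireframe end nodes along its principal axis (the helper name is hypothetical):

```python
import numpy as np

def element_axis_nodes(vertices: np.ndarray):
    """vertices: (n, 3) array of element vertex coordinates."""
    mean = vertices.mean(axis=0)
    X = vertices - mean                  # mean-subtracted matrix X
    C = (X.T @ X) / (len(X) - 1)         # covariance matrix, Eq. (1)
    eigvals, V = np.linalg.eigh(C)       # decomposition C = V Λ V^T
    axis = V[:, np.argmax(eigvals)]      # principal-axis direction
    t = X @ axis                         # scores along the main PC
    node_a = mean + t.min() * axis       # wireframe end nodes of the element
    node_b = mean + t.max() * axis
    return node_a, node_b
```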
2.2. Structural Processing and Data Extraction
Following the data acquisition phase, both data sources—the As-Is point cloud and the As-Designed wireframe—are processed together in an integrated workflow. Operating within a shared geospatial reference system, the wireframe provides the geometric centrelines of structural elements, which are used to slice the As-Is point cloud into multiple parallel double planes oriented perpendicularly to each element axis. The resulting 3D point cloud segments, located between each double-plane pair, are then projected into 2D cross-sectional images for each structural element. This process is repeated along each element axis, enabling data extraction at multiple key positions. The overall procedure is schematically illustrated in Figure 5. Each generated image is stored with a standardised resolution and a fixed metric scale for consistency across all sections, following a structured naming convention including element ID, section type and projection index. In the adopted configuration, the projection window corresponds to a 1 m × 1 m area, with a constant image resolution, resulting in a known pixel-to-metric relationship (1 pixel ≈ 1 mm). This enables subsequent processing and comparison to be performed in true scale, independent of visual representation. The proposed workflow remains scalable for both real and synthetic point clouds, provided a shared geospatial reference system is maintained.
Figure 5.
Procedure of projecting 3D point cloud crops onto 2D slicing plane surfaces to generate steel cross-sectional images.
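A minimal sketch of this projection step follows, under the simplifying assumption that the slice points have already been transformed into an element-local frame whose z axis follows the centreline; the 1 m window and 1000 px resolution reproduce the 1 px ≈ 1 mm relationship stated above.

```python
import numpy as np

def project_slice(points: np.ndarray, z_lo: float, z_hi: float,
                  window: float = 1.0, px: int = 1000) -> np.ndarray:
    """Rasterise the points between two cutting planes into a 2D image.

    points: (n, 3) array in the element-local frame (metres, z along axis).
    """
    sl = points[(points[:, 2] >= z_lo) & (points[:, 2] < z_hi)]
    img = np.zeros((px, px), dtype=np.uint8)
    # Map metric (x, y), centred on the element axis, to pixel indices.
    uv = ((sl[:, :2] + window / 2.0) / window * px).astype(int)
    inside = (uv >= 0).all(axis=1) & (uv < px).all(axis=1)
    img[uv[inside, 1], uv[inside, 0]] = 255   # one white pixel per point
    return img

# e.g. a 20 mm thick slice at mid-height of a 7.35 m column:
# cross_section = project_slice(local_points, 3.665, 3.685)
```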
At this point, it was observed that real-world cross-sectional images frequently contain noise, occlusions and incomplete regions which hinder accurate analysis and model training. To mitigate this limitation, a 3D augmentation process based on synthetic data was implemented by applying controlled perturbations to extracted 3D segments prior to projection. The structured naming convention ensures consistent application of several 3D augmentation techniques: (i) Gaussian noise injection; (ii) voxel down-sampling at multiple scales; (iii) directional point dropouts to simulate occlusions or missing points; (iv) synthetic profile insertion with standardised I-section profiles such as HEA, HEB and IPE; and (v) simulated ghost points to replicate reflection or duplication effects. The results of these augmentations are shown in Figure 6, illustrating the progression from a clean synthetic projection to increasingly augmented variations in the same cross-section class. This enriched and varied dataset of labelled 2D cross-sectional images enabled robust deep learning model training for steel cross-section segmentation and classification, as detailed in Section 2.3.
Figure 6.
Example of the 3D data augmentation process applied to synthetic HEA and IPE cross-sections: (a) clean projection of the original synthetic segment, (b) Gaussian noise and dropout augmentation, (c) synthetic profile insertion, (d) simulated cross-section duplication.
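Two of these perturbations, Gaussian noise injection (i) and directional dropout (iii), can be sketched as follows; the noise level and dropout fraction are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(42)

def gaussian_noise(points: np.ndarray, sigma: float = 0.002) -> np.ndarray:
    """Jitter every point by zero-mean Gaussian noise (sigma in metres)."""
    return points + rng.normal(0.0, sigma, points.shape)

def directional_dropout(points: np.ndarray, direction=(1.0, 0.0, 0.0),
                        frac: float = 0.3) -> np.ndarray:
    """Remove the fraction of points furthest along a given direction,
    simulating occlusion of one side of the cross-section."""
    d = points @ np.asarray(direction, dtype=float)
    cutoff = np.quantile(d, 1.0 - frac)
    return points[d < cutoff]
```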
2.3. YOLOv8 for Segmentation and Identification of Steel Profiles
In order to enhance the automation and reliability of cross-sectional analysis, a YOLOv8 instance segmentation model was trained to detect, classify and segment steel profiles’ cross-sections in 2D image projections. It should be noted that the proposed methodology does not aim to directly measure profile sectional properties from raw point cloud data. Instead, steel cross-sections are identified and classified using a deep learning model, and the corresponding geometric properties are assumed based on standardised commercial profile definitions.
Recognising structural profiles from real-world data, which often contain noise and partial occlusions, requires a robust, high-performance vision model capable of delivering fast and accurate predictions. For this reason, YOLOv8 was selected over other alternatives such as YOLOv5 and Mask R-CNN, as it offers an optimal balance of accuracy, inference speed and ease of deployment. In contrast to Mask R-CNN, a two-stage model with high segmentation accuracy but also high computational cost, YOLOv8 functions as a one-stage, anchor-free detector with an integrated segmentation head based on a prototype mask. Its architecture integrates a multi-scale backbone and optimised head, ensuring fast and precise segmentation. YOLOv8 was chosen to align with the proposed methodology, designed for standard-hardware deployment and scalable analysis during structural assembly.
An integrated overview of the YOLOv8 segmentation architecture is presented in Figure 7, illustrating the complete processing pipeline from raw 2D cross-sectional inputs to final instance-segmentation outputs, including class labels, HBBs and segmentation masks. This approach enables direct recognition of structural elements despite partial occlusions inherent to digital acquisition, reducing dependence on manual profile identification or heuristic reconstruction.
Figure 7.
YOLOv8 segmentation pipeline.
2.3.1. Dataset and Annotation
The YOLOv8 model was trained on a custom dataset combining both augmented synthetic projection images and real data acquired through LiDAR-based surveys. The dataset includes 100 distinct classes representing structural steel cross-sections from the HEA, HEB and IPE series, each considered in two orientations: a standard configuration and a rotated variant, labelled with the suffix “R” (e.g., HEA100 and HEA100R). This covers profiles from 100 to 500 mm in height in each family, enabling the model to recognise and classify the full spectrum of commonly used steel sections. Orientation-based labelling follows engineering practice, as rotation influences structural behaviour and enables the validation of correct placement during assembly.
For rigorous geometric analysis, a systematic data annotation procedure was established. Images were first organised by class, each beginning with a clean synthetic image to ensure accurate initial labelling. Roboflow [53] enabled annotation duplication across images, ensuring consistent labelling of synthetic and real data and preserving geometric consistency. This systematic annotation process resulted in a robust model capable of recognising commercial steel cross-sections under multiple geometric and orientation conditions.
2.3.2. Training Strategy and Prediction Results
The training process aimed to maximise the model’s generalisation and robustness through a mix of clean and augmented synthetic images, supplemented by real data from several field campaigns. Optuna [54] was used for hyperparameter tuning, and training employed the AdamW optimiser, which adjusts learning rates and applies weight decay to stabilise gradient updates and reduce overfitting. Initially trained solely on synthetic data, the model showed strong performance on low-noise real images but reduced accuracy on complex cases, revealing the need for fine-tuning. A second training cycle incorporated real projections and additional synthetic augmentation while retaining previously learned features. Table 2 summarises the main training hyperparameters, including device configuration, dataset split, image counts, resolution, batch size, epochs and optimiser settings.
Table 2.
Overview of the YOLOv8 training hyperparameter configuration.
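A compact sketch of this tuning loop is given below, combining Optuna with the Ultralytics training API; the dataset file, search ranges, trial budget and metric key are assumptions rather than the exact configuration of Table 2.

```python
import optuna
from ultralytics import YOLO

def objective(trial: optuna.Trial) -> float:
    # Search the learning rate and weight decay on a log scale.
    lr0 = trial.suggest_float("lr0", 1e-5, 1e-2, log=True)
    weight_decay = trial.suggest_float("weight_decay", 1e-5, 1e-2, log=True)
    model = YOLO("yolov8m-seg.pt")  # pretrained segmentation checkpoint
    metrics = model.train(data="cross_sections.yaml", epochs=50, imgsz=1024,
                          batch=16, optimizer="AdamW", lr0=lr0,
                          weight_decay=weight_decay, verbose=False)
    # Maximise mask mAP@50-95; key name as exposed by recent Ultralytics builds.
    return metrics.results_dict["metrics/mAP50-95(M)"]

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=20)
print("Best hyperparameters:", study.best_params)
```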
The model’s predictions include both the profile class identification and object segmentation, enabling localisation and geometric analysis on previously unseen images. Its ability to generalise to real data stems from the diverse and extensive augmentation strategy. Furthermore, YOLO’s output predictions systematically store geometric data for each detection, including HBB coordinates that define the axis-aligned rectangular region enclosing the object, and segmentation mask coordinates which provide a pixel-level delineation of its shape and position. Table 3 presents the main validation metrics obtained from the final training configuration, including bounding box (B) and segmentation mask (M) performance, showing consistently high precision (P), recall (R) and mAP scores across both tasks. Figure 8 illustrates several examples of the model’s predictions applied to real-world cross-sectional projections.
Table 3.
Validation metrics of the final YOLOv8 segmentation model.
Figure 8.
YOLOv8 segmentation results on real 2D cross-sectional images with confidence levels: (a) IPE330 (0.86), (b) IPE200R (0.96), (c) IPE180 (0.93, rotated), (d) IPE220 (0.91, noisy), (e) HEA180 (0.97, base cut revealing the bolt connections), (f) HEB140R (0.94), (g) IPE330R (0.92), (h) HEB200 (0.93, ignoring the L-section).
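For illustration, these stored geometric outputs can be read back from a prediction as follows (the weights and image file names are hypothetical):

```python
from ultralytics import YOLO

model = YOLO("cross_section_seg.pt")              # fine-tuned weights (assumed name)
result = model.predict("element_slice.png", conf=0.5)[0]

if result.masks is not None:                      # masks are absent if nothing is found
    for box, polygon in zip(result.boxes, result.masks.xy):
        name = result.names[int(box.cls)]         # e.g. "HEA180" or "IPE200R"
        x1, y1, x2, y2 = box.xyxy[0].tolist()     # axis-aligned HBB in pixels
        print(name, float(box.conf), (x1, y1, x2, y2), polygon.shape)
```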
2.3.3. Prediction-Based Filtering and OBB Computation
Accurate geometric analysis of point cloud projections requires post-processing computer vision methods to isolate key features and improve data quality. Here, segmentation-driven filtering is integrated with OBB computation for precise characterisation. Each segmentation mask generated by the model is overlaid on the corresponding image to isolate the object. A binary mask assigns a value of 1 to the object and 0 to the background, effectively removing noise. This processed output forms the basis for accurate downstream geometric analysis.
The resulting filtered image is processed using the OpenCV library [55] to extract geometric boundaries of the segmented object. Specifically, the findContours function detects the object’s outer edges and generates detailed contour representations. These contours are then processed with the minAreaRect function to compute the minimum-area rectangle enclosing the object, yielding an OBB that compactly represents the profile’s shape and rotation. The entire pipeline is exemplified in Figure 9, which illustrates each processing stage.
Figure 9.
Step-by-step visualisation of the segmentation and geometric analysis pipeline: (a) original raw 2D cross-sectional image, (b) segmentation mask generated by the YOLOv8, (c) filtered image with the isolated section, (d) computed OBB, with red dot representing the OBB centroid.
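A minimal sketch of this filtering and OBB computation stage is shown below, built around the OpenCV calls named above; applying the mask with bitwise_and and selecting the largest contour are implementation assumptions.

```python
import cv2
import numpy as np

def mask_to_obb(image: np.ndarray, mask: np.ndarray):
    """image: grayscale 2D projection; mask: binary instance mask (0/1)."""
    mask8 = (mask > 0).astype(np.uint8)
    filtered = cv2.bitwise_and(image, image, mask=mask8)  # background set to 0
    contours, _ = cv2.findContours(mask8, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    largest = max(contours, key=cv2.contourArea)          # outer edge of the profile
    rect = cv2.minAreaRect(largest)    # ((cx, cy), (w, h), angle) = the OBB
    corners = cv2.boxPoints(rect)      # four OBB vertices
    return filtered, rect, corners
```

With the fixed 1 px ≈ 1 mm scale of the projections, the OBB centre and dimensions convert directly to millimetres for the downstream deviation analysis.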
2.4. Geometric Analysis and EN 1090-2:2020 Validation
Geometric quality control under EN 1090-2:2020 is assessed by measuring deviations between the executed structure (As-Is model) and the design intent (As-Design model). The proposed methodology validates steel structural elements against five key geometric deviation criteria defined in the standard, encompassing planimetric positioning, column inclination and straightness, and beam inclination and straightness. These criteria were selected for their structural relevance and ability to represent overall geometric conformity; however, EN 1090-2:2020 defines additional tolerances for other assembly conditions not addressed in this study. The applicable deviation for each criterion depends on the execution class and element length (l) or height (h). Table 4 summarises the corresponding tolerance limits and geometric scenarios defined in EN 1090-2:2020.
Table 4.
Geometric deviation limits defined in EN 1090-2:2020 for assembly of structural steel elements.
The proposed methodology employs a centroid-based analysis of the computed OBB, extracted from the segmented and filtered images. Two comparison strategies are implemented, depending on the specific normative criterion considered.
- (a) Internal comparison within the local element—This strategy evaluates consistency along a single element rather than its absolute position in the global structure. Centroids from different cross-sectional slices of the same element are compared without reference to the As-Design model. It applies to criteria C2, C3, C4 and C5.
- (b) Direct comparison with the As-Design model—This strategy involves comparing each OBB centroid at the bottom to its corresponding reference point in the As-Design wireframe. It is specifically used to assess planimetric positioning accuracy (criterion C1).
The analysis measures the translational displacements (Δx, Δy) between the As-Design reference centroid and the As-Is centroids. The validation process focuses on three distinct regions of each element: the Bottom (slices 1–20), the Mid (slices 40–60) and the Top (slices 80–100). These intervals cover structurally relevant sections and ensure spatial coverage and methodological robustness, even when data are incomplete or partially predicted. Figure 10 illustrates the two centroid-based measurement strategies defined for geometric validation following EN 1090-2:2020. Each criterion is assessed within predefined slice ranges corresponding to its geometric purpose: C1 uses strategy (b) on the bottom region relative to its As-Design reference; C2–C5 use strategy (a), where C2 compares bottom and top centroids; C3 evaluates mid-span deviations along the element axis; C4 analyses left- and right-side elevations; and C5 examines mid-span straightness along the element axis. The effectiveness of this strategy depends on accurate alignment between the As-Is and As-Design models.
Figure 10.
Geometric measurement strategies based on OBB centroid: (a) internal comparison within local element, (b) direct comparison with the As-Design model.
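To make the two strategies concrete, the sketch below checks criteria C1–C3 for a single column from 3D OBB centroids (in metres, z vertical). The tolerance expressions (±10 mm, h/300, h/1000) are inferred from the Class 1 limits quoted in Section 3 and are illustrative only, not a restatement of the full EN 1090-2:2020 tables; the helper names are hypothetical.

```python
import numpy as np

H = 7.35  # column height in metres (example value from Case I)

def check_c1(base_c, design_pt, tol=0.010):
    """C1: planimetric positioning vs. the As-Design reference (strategy b)."""
    dx, dy = (np.asarray(base_c) - np.asarray(design_pt))[:2]
    return abs(dx) <= tol and abs(dy) <= tol

def check_c2(base_c, top_c, h=H):
    """C2: column inclination from base/top centroids (strategy a)."""
    delta = np.linalg.norm((np.asarray(top_c) - np.asarray(base_c))[:2])
    return delta <= h / 300       # 7.35 m -> 24.5 mm, i.e. the quoted ±2.4 cm

def check_c3(base_c, mid_c, top_c, h=H):
    """C3: straightness as the mid-centroid offset from the base-top chord."""
    b, m, t = map(np.asarray, (base_c, mid_c, top_c))
    chord = t - b
    s = np.dot(m - b, chord) / np.dot(chord, chord)
    dm = np.linalg.norm(m - (b + s * chord))   # perpendicular deviation Δm
    return dm <= h / 1000         # 7.35 m -> ~7 mm, i.e. the quoted ≈0.7 cm
```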
Figure 11 summarises the proposed automated workflow for geometric control of industrial steel structures. The process integrates multiple data sources and analytical modules into a continuous pipeline that links acquisition, processing, recognition and validation. The As-Is point cloud model and the As-Design information are jointly processed to generate a standardised 2D projection, enabling the YOLOv8 segmentation model to detect and classify structural cross-sections automatically. Rather than relying on direct estimation of sectional dimensions from point cloud geometry, the workflow establishes cross-sectional shapes through YOLOv8-based classification of standardised steel profiles.
Figure 11.
Flowchart of the proposed methodology.
The integration of AI-based recognition with rule-based geometric validation enables each element to be assessed according to its typology (IfcColumn or IfcBeam) and verified against the applicable EN 1090-2:2020 tolerance criteria. This workflow consolidates all intermediate steps—from data alignment to deviation quantification—into a single, automated framework for consistent quality control.
3. Application and Discussion
This section presents the results derived from applying the proposed geometric control methodology to two real-world case studies. The analysis focuses on verifying their geometric compliance with EN 1090-2:2020.
3.1. Case Studies
The methodology was tested on two industrial steel assembly case studies located in Amarante and Maia (Porto district, Portugal), designated as Case I and Case II, respectively. Figure 12a,b illustrates the structural assembly stages during the surveys. Case I, evaluated at a more advanced execution phase, had all columns and most beams installed, supporting a comprehensive element-based validation. Case II, captured at an earlier phase of construction, with only columns and partial bracings, presented greater geometric complexity due to its inclined façades.

Figure 12.
Survey data and model representations: (a) Case I—Amarante, (b) Case II—Maia, (c) Case I—captured 3D point cloud, (d) Case II—captured 3D point cloud, (e) Case I—As-Design wireframe reconstruction, (f) Case II—As-Design wireframe reconstruction, (g) Case I—structural elements validated under EN 1090-2:2020, with analysed elements highlighted in red, (h) Case II—structural elements validated under EN 1090-2:2020, with analysed elements highlighted in red.
Both surveys used TLS to obtain high-resolution 3D point clouds of the structure. Two different pieces of equipment were used: Leica BLK360 G1 for Case I and Leica RTC360 for Case II. Real-time pre-alignment in Cyclone FIELD 360 and subsequent bundle adjustment in Cyclone REGISTER 360 PLUS minimised registration errors. The processed point clouds are shown in Figure 12c,d.
Referential theoretical geometries were derived from As-Designed IFC models through the application of a PCA-based wireframe extraction algorithm. Figure 12e,f depicts the resulting wireframe reconstruction models for both cases. Georeferencing required two different approaches: for Case I, lacking georeferencing, alignment between the As-Is and As-Designed models was achieved with an ICP-based registration, yielding mean and median residual errors of 2.0 cm and 1.3 cm, respectively; for Case II, a georeferenced IFC was provided and on-site GCPs were deployed, allowing direct CRS alignment and a more accurate geometric comparison. Figure 12g,h highlights the structural elements assessed under EN 1090-2:2020. These were validated according to the five geometric-deviation criteria listed in Table 4.
3.2. Geometric Control
3.2.1. Case I—Industrial Building in Amarante
The geometric validation for Case I focused on representative columns and beams in two of the four identical modular halls, ensuring spatial coverage and structural representativeness. Partition walls, placed between some columns, restricted access to certain elements, reducing the analysis to those fully captured. The quantitative results are summarised in Table 5, which consolidates the measured deviations of the selected elements under the five geometric criteria defined in EN 1090-2:2020.
Table 5.
Measured geometric deviations for selected structural elements in Case I according to EN 1090-2:2020.
Planimetric positioning of the columns at their base was evaluated under Criterion C1. Elements such as P.213 and P.216 remain within the ±1 cm Class 1 tolerance. Columns P.194 and P.200 exhibit slightly larger offsets along both directions, consistent with minor residual deviations from the ICP-based alignment. The schematic distribution and magnitude of these deviations are shown in Figure 13a.
Figure 13.
Visual representation of geometric deviation assessment for Case I—Amarante: (a) Criterion C1—Planimetric positioning of columns, (b) Criterion C2—column inclination, (c) Criterion C3—column straightness, (d) Criterion C4—beam vertical inclination, (e) Criterion C5—beam mid-span straightness.
Criterion C2 evaluates column inclination by measuring horizontal displacements between base and top centroid regions. The evaluation considered both X and Y directions to verify uniform vertical alignment across planimetric axes. For 7.35 m high columns, the Class 1 tolerance of ±2.4 cm was adopted. Most results exceed the admissible tolerances; nevertheless, the analysis effectively identified and quantified this inclination behaviour, as illustrated in Figure 13b, where the blue and green highlighted zones correspond to the top and base regions of the column, respectively.
The straightness criterion (C3) quantifies the maximum perpendicular distance (Δm) of the mid-region from the line connecting the base and top centroids. All tested columns remained below the ≈0.7 cm tolerance limit, confirming the method’s high sensitivity to small displacements and its capability to detect variations in the millimetre range, as evidenced in Figure 13c, where the middle zone is represented in yellow.
Criteria C4 and C5 extend the geometric evaluation to beam elements, addressing vertical inclination and mid-span straightness, respectively. The assessment of vertical inclination considered the end-to-end height difference (Δy) between beam extremities, governed by the Class 1 tolerance of ±l/500, capped at a maximum of 1 cm, for the 7.35 m member. Mid-span straightness was analysed through the maximum perpendicular deviation (Δm) of the central region from the line connecting both ends, with an admissible limit of ±1.5 cm. As shown in Table 5 and Figure 13d,e, most beams satisfy these requirements. Minor exceedances in B.125 and B.133 indicated local execution deviations, while the remaining elements display sub-centimetric accuracy. The reliable detection of these aerial elements confirms the suitability of TLS surveys for precise beam-level validation, indicating the method’s adaptability.
Overall, Case I demonstrates satisfactory compliance with EN 1090-2:2020. The methodology consistently delivered accurate interpretation of the results across all evaluated criteria, proving effective even under partial visibility and non-georeferenced conditions.
3.2.2. Case II—Industrial Building in Maia
The survey for Case II was conducted at an earlier assembly stage, with only columns and partial bracing installed. During acquisition, on-going cross-bracing installation caused slight dynamic effects on the structure, minimally impacting the scan consistency on some elements. The inclined façades introduced additional geometric complexity, providing a suitable context to test the methodology. The analysis focused on the column elements, evaluated under criteria C1–C3 of EN 1090-2:2020, and the results are synthesised in Table 6.
Table 6.
Measured geometric deviations for selected structural elements in Case II according to EN 1090-2:2020.
Under Criterion C1, the methodology demonstrated consistent performance, with most elements remaining within the tolerance. Columns P.77 and P.81 exhibited marginal horizontal shifts following the global direction of the façade. Nevertheless, the results confirm that direct georeferencing improved positional accuracy and reduced dispersion, as shown in Figure 14a.

Figure 14.
Visual representation of geometric deviation assessment for Case II–Maia: (a) Criterion C1—Planimetric positioning of columns, (b) Criterion C2—column inclination, (c) Criterion C3—column straightness.
The assessment of column inclination revealed a stable vertical behaviour across the structure. All elements stayed under the Class 1 tolerance, except P.53, which presented a 3.7 cm deviation along the Y direction. This deviation likely resulted from incomplete lateral bracing at the time of the survey. In general, the measured displacements showed better performance compared with Case I. The analysis of straightness confirmed negligible deformation along the column axes. All measured mid-region deviations (Δm) remained well below the tolerance limit. Consistent with Case I, these results demonstrate fabrication accuracy and the methodology’s sensitivity to millimetric variations. Figure 14b,c presents two representative examples for both criteria.
4. Conclusions
The integration of point cloud data and DL algorithms into construction workflows represents a major advancement in geometric quality control, particularly in automating normative compliance checks. This study presents a novel framework for the automated verification of geometric conformity in accordance with EN 1090-2:2020, combining laser scanning and AI-driven object detection through the YOLOv8 algorithm for steel cross-section analysis. The methodology was tested in two real-world case studies at different assembly stages using two laser scanning systems. Both proved suitable, with scanner positioning emerging as a key factor in capturing accurate element geometry.
The YOLOv8 model was trained on 48,501 synthetic and real images, supported by 3D data augmentation techniques and a carefully structured annotation process. Unlike traditional methods, this approach enables comprehensive geometric segmentation while compensating for missing or occluded data. Although the current version of the model is limited to three steel profile series with 100 subclasses, it generalises well across varied acquisition conditions. In Case I, the methodology successfully assessed the geometric displacements of 77% of the captured elements. The lower rate was due to limited accessibility and partition walls that restricted full geometry acquisition. In contrast, Case II achieved a higher assessment rate, with 94% of the columns successfully evaluated. Detected deviations were verified against the five normative criteria. On average, three outlier slices arose per element (97% valid data), mainly from overlapping connections affecting model inference.
The model’s performance metrics confirm its robustness. The two-stage training strategy—initially synthetic, then fine-tuned with real images—proved effective for accurate, generalisable predictions. The final model achieved a precision of 94.47% and a recall of 95.84% for both the bounding box and segmentation mask outputs. In terms of segmentation performance, the model reached a mAP@50 of 98.45% and a mAP@50-95 of 70.20%. These results validate the synthetic dataset and 3D augmentation as a solid foundation, with real images being crucial for performance on noisy or occluded data.
While the alignment between TLS-derived point clouds and digital models typically benefits from the use of GCPs, this methodology demonstrated robust performance even in the absence of such a reference system. This reinforces the flexibility and practical applicability of the approach in real-world construction environments.
The framework can be extended to a wider range of steel profiles and other structural materials through dataset expansion and refined annotation. Its flexibility is relevant for both heritage rehabilitation, such as timber framing, and modern Light Steel Framing (LSF) systems, although the applicable standards and tolerance criteria would differ from those defined in EN 1090-2:2020. While this study focused on validating five geometric conformity criteria from EN 1090-2:2020, the framework can integrate additional tolerance criteria for both assembly and fabrication phases. It also supports consistent identification and placement of structural profiles, reducing systematic errors and preserving design intent.
From a practical engineering perspective, the main contribution of this study is the establishment of an end-to-end digital pipeline that replaces conventional manual geometric inspection with an automated, repeatable and auditable process. The proposed Scan-2D Projection-YOLOv8-Validation workflow integrates terrestrial laser scanning, 2D cross-sectional projection, deep learning-based profile identification and automated validation against EN 1090-2:2020, enabling objective and regulation-oriented geometric quality control during steel structure assembly. The approach reduces reliance on subjective visual inspection and manual measurements, enhances inspection efficiency and safety, and provides traceable evidence of geometric conformity suitable for integration into contemporary digital construction workflows.
Future developments may involve the integration of UAV-mounted LiDAR (ALS) to enhance geometric inspection of elevated or inaccessible components. The methodology could also evolve into a real-time tool for on-site verification, particularly when combined with augmented reality (AR) systems, enabling operators to visualise geometric displacements during assembly. Additionally, coupling the framework with BIM-based digital twins could enable real-time structural monitoring and dynamic updates to As-Is models based on verified deviations.
Author Contributions
Conceptualization, J.V. and D.R.; methodology, J.V.; software, J.V. and J.M.; validation, J.V., J.M., T.J., P.O., R.S., R.C., R.F.M., R.O. and L.A.; formal analysis, J.V. and D.R.; investigation, J.V., J.M., R.S., R.F.M., R.O. and D.R.; resources, L.A. and D.R.; data curation, J.V.; writing—original draft preparation, J.V.; writing—review and editing, J.M., T.J., P.O., R.S., R.C., R.F.M., R.O., L.A. and D.R.; visualization, J.V.; supervision, D.R.; project administration, D.R.; funding acquisition, L.A. and D.R. All authors have read and agreed to the published version of the manuscript.
Funding
The authors want to acknowledge the financial support of I&D project CHECK METALLIC, financed by J.F Metal—Metalomecânica, S.A., through the SIFIDE, Sistema de Incentivos Fiscais à Investigação e Desenvolvimento Empresarial. Also, this work was financially supported by: UID/04708/2025 and https://doi.org/10.54499/UID/04708/2025, of the CONSTRUCT—Instituto de I&D em Estruturas e Construções—funded by Fundação para a Ciência e a Tecnologia, I.P./MECI, through the national funds.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
Code and datasets may be available upon request.
Acknowledgments
The authors gratefully acknowledge Eng. Óscar Moreira for his assistance and support during all experimental campaigns. They also wish to thank Eng. António Pestana for his valuable insights, discussions and guidance throughout the development of this project. Special thanks are extended to Engs. Vitor Gonçalves and Marcos Lima, from JF METAL—Metalomecânica, for their continuous support and collaboration.
Conflicts of Interest
Author Liliana Araujo was employed by the company JF Metal. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
References
- Esposito, M.B.; Bosché, F. Building Information Model Pre-Processing for Automated Geometric Quality Control. In Proceedings of the 39th International Symposium on Automation and Robotics in Construction, Bogota, Colombia, 12–15 July 2022; pp. 17–24. [Google Scholar]
- Liu, J.; Cui, N.; Cheng, G.; Li, D.; Ma, X.; Liao, Y. Towards the automated virtual trial assembly of large and complex steel members using terrestrial laser scanning and BIM. Eng. Struct. 2023, 291, 116448. [Google Scholar] [CrossRef]
- Xu, Y.; Wang, Y.; Yang, J.; Zhang, J. Two-stage terrestrial laser scan planning framework for geometric measurement of civil infrastructures. Measurement 2025, 242, 115785. [Google Scholar] [CrossRef]
- Ribeiro, D.; Rakoczy, A.M.; Cabral, R.; Hoskere, V.; Narazaki, Y.; Santos, R.; Tondo, G.; Gonzalez, L.; Matos, J.C.; Massao Futai, M.; et al. Methodologies for Remote Bridge Inspection—Review. Sensors 2025, 25, 5708. [Google Scholar] [CrossRef]
- Bassier, M.; Vincke, S.; De Winter, H.; Vergauwen, M. Drift Invariant Metric Quality Control of Construction Sites Using BIM and Point Cloud Data. ISPRS Int. J. Geo-Inf. 2020, 9, 545. [Google Scholar] [CrossRef]
- Shen, N.; Wang, B.; Ma, H.; Zhao, X.; Zhou, Y.; Zhang, Z.; Xu, J. A review of terrestrial laser scanning (TLS)-based technologies for deformation monitoring in engineering. Measurement 2023, 223, 113684. [Google Scholar] [CrossRef]
- Oytun, M.; Atasoy, G. Effect of Terrestrial Laser Scanning (TLS) parameters on the accuracy of crack measurement in building materials. Autom. Constr. 2022, 144, 104590. [Google Scholar] [CrossRef]
- Chacón, R.; Puig-Polo, C.; Real, E. TLS measurements of initial imperfections of steel frames for structural analysis within BIM-enabled platforms. Autom. Constr. 2021, 125, 103618. [Google Scholar] [CrossRef]
- Lenda, G.; Marmol, U. Integration of high-precision UAV laser scanning and terrestrial scanning measurements for determining the shape of a water tower. Measurement 2023, 218, 113178. [Google Scholar] [CrossRef]
- Besl, P.J.; McKay, N.D. A method for registration of 3-D shapes. IEEE Trans. Pattern Anal. Mach. Intell. 1992, 14, 239–256. [Google Scholar] [CrossRef]
- Cheng, Y.; Chu, H.; Li, Y.; Tang, Y.; Luo, Z.; Li, S. A Hybrid Improved SAC-IA with a KD-ICP Algorithm for Local Point Cloud Alignment Optimization. Photonics 2024, 11, 635. [Google Scholar] [CrossRef]
- He, Y.; Liang, B.; Yang, J.; Li, S.; He, J. An Iterative Closest Points Algorithm for Registration of 3D Laser Scanner Point Clouds with Geometric Features. Sensors 2017, 17, 1862. [Google Scholar] [CrossRef]
- Azizi Zade, F.; Ebrahimkhanlou, A. Point clouds to as-built two-node wireframe digital twin: A novel method to support autonomous robotic inspection. Auton. Intell. Syst. 2024, 4, 25. [Google Scholar] [CrossRef]
- He, L.; Zhu, W.; Xu, Y. Best-effort projection based attribute compression for 3D point cloud. In 2017 23rd Asia-Pacific Conference on Communications (APCC); IEEE: New York, NY, USA, 2017; pp. 1–6. [Google Scholar]
- Xiao, J.-L.; Fan, J.-S.; Liu, Y.-F.; Li, B.-L.; Nie, J.-G. Region of interest (ROI) extraction and crack detection for UAV-based bridge inspection using point cloud segmentation and 3D-to-2D projection. Autom. Constr. 2024, 158, 105226. [Google Scholar] [CrossRef]
- Qi, C.R.; Su, H.; Mo, K.; Guibas, L.J. Pointnet: Deep learning on point sets for 3d classification and segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition; IEEE: New York, NY, USA, 2017; pp. 652–660. [Google Scholar]
- Qi, C.R.; Yi, L.; Su, H.; Guibas, L.J. Pointnet++: Deep hierarchical feature learning on point sets in a metric space. Adv. Neural Inf. Process. Syst. 2017, 30. [Google Scholar]
- Riegler, G.; Osman Ulusoy, A.; Geiger, A. Octnet: Learning deep 3d representations at high resolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition; IEEE: New York, NY, USA, 2017; pp. 3577–3586. [Google Scholar]
- Li, R.; Zhang, Y.; Niu, D.; Yang, G.; Zafar, N.; Zhang, C.; Zhao, X. PointVGG: Graph convolutional network with progressive aggregating features on point clouds. Neurocomputing 2021, 429, 187–198. [Google Scholar] [CrossRef]
- Le, T.; Duan, Y.; Recognition, P. PointGrid: A Deep Network for 3D Shape Understanding. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–22 June 2018; pp. 9204–9214. [Google Scholar]
- Xu, Y.; Fan, T.; Xu, M.; Zeng, L.; Qiao, Y. Spidercnn: Deep learning on point sets with parameterized convolutional filters. In Proceedings of the European Conference on Computer Vision (ECCV); IEEE: New York, NY, USA, 2018; pp. 87–102. [Google Scholar]
- Li, Y.; Bu, R.; Sun, M.; Chen, B. PointCNN: Convolution On χ-Transformed Points. arXiv 2018, arXiv:1801.07791. [Google Scholar] [CrossRef]
- Wang, Y.; Sun, Y.; Liu, Z.; Sarma, S.E.; Bronstein, M.M.; Solomon, J.M. Dynamic graph cnn for learning on point clouds. ACM Trans. Graph. 2019, 38, 1–12. [Google Scholar] [CrossRef]
- Llerena, J.M.; Zeni, L.F.; Kristen, L.N.; Jung, C. Gaussian bounding boxes and probabilistic intersection-over-union for object detection. arXiv 2021, arXiv:2106.06072. [Google Scholar]
- Fu, R.; Cao, M.; Novák, D.; Qian, X.; Alkayem, N.F. Extended efficient convolutional neural network for concrete crack detection with illustrated merits. Autom. Constr. 2023, 156, 105098. [Google Scholar] [CrossRef]
- Szegedy, C.; Wei, L.; Yangqing, J.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going deeper with convolutions. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR); IEEE: New York, NY, USA, 2015; pp. 1–9. [Google Scholar]
- Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556. [Google Scholar]
- Wu, Z.; Shen, C.; van den Hengel, A. Wider or Deeper: Revisiting the ResNet Model for Visual Recognition. Pattern Recognit. 2019, 90, 119–133. [Google Scholar] [CrossRef]
- Fu, R.; Zhang, Y.; Zhu, K.; Strauss, A.; Cao, M. Real-time detection of concrete cracks via enhanced You Only Look Once Network: Algorithm and software. Adv. Eng. Softw. 2024, 195, 103691. [Google Scholar] [CrossRef]
- Xu, G.; Yue, Q.; Liu, X. Real-time multi-object detection model for cracks and deformations based on deep learning. Adv. Eng. Inform. 2024, 61, 102578. [Google Scholar] [CrossRef]
- Vasanthi, P.; Mohan, L. Efficient YOLOv8 algorithm for extreme small-scale object detection. Digit. Signal Process. 2024, 154, 104682. [Google Scholar] [CrossRef]
- Badgujar, C.M.; Poulose, A.; Gan, H. Agricultural object detection with You Only Look Once (YOLO) Algorithm: A bibliometric and systematic literature review. Comput. Electron. Agric. 2024, 223, 109090. [Google Scholar] [CrossRef]
- Redmon, J. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition; IEEE: New York, NY, USA, 2016. [Google Scholar]
- Terven, J.; Córdova-Esparza, D.-M.; Romero-González, J.-A. A Comprehensive Review of YOLO Architectures in Computer Vision: From YOLOv1 to YOLOv8 and YOLO-NAS. Mach. Learn. Knowl. Extr. 2023, 5, 1680–1716. [Google Scholar] [CrossRef]
- Sun, R.; Wang, S.; Li, M.; Zhu, Y. An algorithm for large-span flexible bridge pose estimation and multi-keypoint vibration displacement measurement. Measurement 2025, 240, 115582. [Google Scholar] [CrossRef]
- Yang, K.; Bao, Y.; Li, J.; Fan, T.; Tang, C. Deep learning-based YOLO for crack segmentation and measurement in metro tunnels. Autom. Constr. 2024, 168, 105818. [Google Scholar] [CrossRef]
- Casas, E.; Ramos, L.; Romero, C.; Rivas-Echeverría, F. A comparative study of YOLOv5 and YOLOv8 for corrosion segmentation tasks in metal surfaces. Array 2024, 22, 100351. [Google Scholar] [CrossRef]
- Kumar, P.; Batchu, S.; Swamy, S.N.; Kota, S.R. Real-Time Concrete Damage Detection Using Deep Learning for High Rise Structures. IEEE Access 2021, 9, 112312–112331. [Google Scholar] [CrossRef]
- Ruggieri, S.; Cardellicchio, A.; Nettis, A.; Renò, V.; Uva, G. Automatic detection of typical defects in reinforced concrete bridges via YOLOv5. Procedia Struct. Integr. 2024, 62, 129–136. [Google Scholar] [CrossRef]
- Marín-García, D.; Bienvenido-Huertas, D.; Carretero-Ayuso, M.J.; Torre, S.D. Deep learning model for automated detection of efflorescence and its possible treatment in images of brick facades. Autom. Constr. 2023, 145, 104658. [Google Scholar] [CrossRef]
- Cao, X.; Zou, H.; Li, J.; Ying, X.; He, S. OBBInst: Remote sensing instance segmentation with oriented bounding box supervision. Int. J. Appl. Earth Obs. Geoinf. 2024, 128, 103717. [Google Scholar] [CrossRef]
- Wang, J.; Ding, J.; Guo, H.; Cheng, W.; Pan, T.; Yang, W. Mask OBB: A Semantic Attention-Based Mask Oriented Bounding Box Representation for Multi-Category Object Detection in Aerial Images. Remote Sens. 2019, 11, 2930. [Google Scholar] [CrossRef]
- Zhou, S.; Zhong, M.; Chai, X.; Zhang, N.; Zhang, Y.; Sun, Q.; Sun, T. Framework of rod-like crops sorting based on multi-object oriented detection and analysis. Comput. Electron. Agric. 2024, 216, 108516. [Google Scholar] [CrossRef]
- Gong, Y.; Zhang, F.; Jia, X.; Mao, Z.; Huang, X.; Li, D. Instance Segmentation in Very High Resolution Remote Sensing Imagery Based on Hard-to-Segment Instance Learning and Boundary Shape Analysis. Remote Sens. 2022, 14, 23. [Google Scholar] [CrossRef]
- Ma, J.; Shao, W.; Ye, H.; Wang, L.; Wang, H.; Zheng, Y.; Xue, X. Arbitrary-Oriented Scene Text Detection via Rotation Proposals. IEEE Trans. Multimed. 2018, 20, 3111–3122. [Google Scholar] [CrossRef]
- Sabino, R.; Vidal, C.A.; Cavalcante-Neto, J.B.; Maia, J.G.R. Building Oriented Bounding Boxes by the intermediate use of ODOPs. Comput. Graph. 2023, 116, 251–261. [Google Scholar] [CrossRef]
- Yang, X.; Zhang, G.; Yang, X.; Zhou, Y.; Wang, W.; Tang, J.; He, T.; Yan, J. Detecting rotated objects as gaussian distributions and its 3-d generalization. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 45, 4335–4354. [Google Scholar] [CrossRef]
- Qian, W.; Yang, X.; Peng, S.; Yan, J.; Guo, Y. Learning modulated loss for rotated object detection. In Proceedings of the AAAI Conference on Artificial Intelligence; IEEE: New York, NY, USA, 2021; pp. 2458–2466. [Google Scholar]
- Zhang, H.-S.; Fan, J.-S.; Duan, X.-J.; Nie, J.-G.; Huang, Y.-B.; Chai, T.-T.; Liu, Y.-F. Point cloud segmentation and construction verification for large-span modular steel structures. J. Constr. Steel Res. 2025, 226, 109288. [Google Scholar] [CrossRef]
- NP EN 1090-2:2020; Execution of Steel Structures and Aluminium Structures—Part 2: Technical Requirements for Steel Structures. Instituto Português da Qualidade (IPQ): Lisboa, Portugal, 2020.
- Leica Geosystems. Terrestrial Laser Scanning Solutions; Leica Geosystems: St. Gallen, Switzerland, 2024. [Google Scholar]
- Jolliffe, I.T. Principal Component Analysis; Springer: Berlin/Heidelberg, Germany, 2002. [Google Scholar]
- Dwyer, B.; Nelson, J.; Hansen, T. Roboflow, version 1.0; Roboflow: Des Moines, IA, USA, 2024; Available online: https://roboflow.com (accessed on 12 January 2026).
- Akiba, T.; Sano, S.; Yanase, T.; Ohta, T.; Koyama, M. Optuna: A next-generation hyperparameter optimization framework. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining; Association for Computing Machinery: New York, NY, USA, 2019; pp. 2623–2631. [Google Scholar]
- Bradski, G. The OpenCV Library. Dr. Dobb’s J. Softw. Tools 2000, 25, 120–123. [Google Scholar]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.