Open Access Communication
Two-Stage Marker Detection–Localization Network for Bridge-Erecting Machine Hoisting Alignment
by Lei Li, Zelong Xiao * and Taiyang Hu
School of Electronic and Optical Engineering, Nanjing University of Science and Technology, Nanjing 210094, China
* Author to whom correspondence should be addressed.
Sensors 2025, 25(17), 5604; https://doi.org/10.3390/s25175604 (registering DOI)
Submission received: 20 July 2025 / Revised: 30 August 2025 / Accepted: 1 September 2025 / Published: 8 September 2025
Abstract
To address interference from complex construction environments (e.g., lighting variations, occlusion, and marker contamination) and the demand for high-precision alignment during the hoisting process of bridge-erecting machines, this paper presents a two-stage marker detection–localization network tailored to hoisting alignment. The proposed network adopts a “coarse detection–fine estimation” phased framework. The first stage employs a lightweight detection module that integrates a dynamic hybrid backbone (DHB) and a dynamic switching mechanism to filter background noise efficiently and generate coarse localization boxes for marker regions. Specifically, the DHB dynamically switches between convolutional and Transformer branches to handle features of varying complexity, using depthwise separable convolutions from MobileNetV3 for low-level geometric features and lightweight Transformer blocks for high-level semantic features. The second stage constructs a Transformer-based homography estimation module, which leverages multi-head self-attention to capture long-range dependencies between marker keypoints and the scene context. By integrating enhanced multi-scale feature interaction and position encoding (combining the absolute position and marker geometric priors), this module learns precise homography matrices between markers and hoisting equipment end to end from the coarse localization boxes. To address data scarcity in construction scenes, a multi-dimensional data augmentation strategy is developed, comprising random homography transformation (simulating viewpoint changes), photometric augmentation (adjusting brightness, saturation, and contrast), and background blending with bounding box extraction. Experiments on a real bridge-erecting machine dataset demonstrate that the network achieves a detection accuracy (mAP) of 97.8%, a homography estimation reprojection error of less than 1.2 mm, and a processing frame rate of 32 FPS. Compared with traditional single-stage CNN-based methods, the network significantly improves alignment precision and robustness in complex environments, offering reliable technical support for the precise control of automated hoisting in bridge-erecting machines.
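As an illustration of two ideas named in the abstract, the random homography transformation used for augmentation and the reprojection error used for evaluation, the following is a minimal NumPy sketch. It is not the authors' implementation: the function names, the corner-jitter sampling scheme, and the `max_shift` parameter are assumptions made for this example, and the error is reported in the units of the input coordinates (the abstract reports millimetres, which would require a calibrated scale that is not modelled here).

```python
import numpy as np


def homography_from_points(src, dst):
    """Fit the 3x3 homography mapping src -> dst (each (N, 2), N >= 4) by DLT."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The homography is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=np.float64))
    h = vt[-1].reshape(3, 3)
    return h / h[2, 2]


def apply_homography(h, pts):
    """Map (N, 2) points through h using homogeneous coordinates."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = pts_h @ h.T
    return mapped[:, :2] / mapped[:, 2:3]


def random_homography(width, height, max_shift=0.1, rng=None):
    """Sample a homography by jittering the four image corners.

    max_shift is a hypothetical parameter: the largest corner displacement,
    expressed as a fraction of the image width/height.
    """
    rng = np.random.default_rng() if rng is None else rng
    corners = np.array(
        [[0, 0], [width, 0], [width, height], [0, height]], dtype=np.float64
    )
    jitter = rng.uniform(-max_shift, max_shift, size=(4, 2)) * np.array([width, height])
    warped = corners + jitter
    return homography_from_points(corners, warped), corners, warped


def reprojection_error(h_est, h_true, pts):
    """Mean Euclidean distance between points mapped by h_est and by h_true."""
    diff = apply_homography(h_est, pts) - apply_homography(h_true, pts)
    return float(np.mean(np.linalg.norm(diff, axis=1)))


# Demo: sample a ground-truth warp, then refit it from slightly noisy corners
# to mimic an imperfect estimate and measure the resulting reprojection error.
rng = np.random.default_rng(0)
h_true, src, dst = random_homography(640, 480, max_shift=0.1, rng=rng)
noisy_dst = dst + rng.normal(scale=0.5, size=dst.shape)
h_est = homography_from_points(src, noisy_dst)
print("reprojection error (input units):", reprojection_error(h_est, h_true, src))
```

In a training pipeline, the sampled matrix would typically also be used to warp the marker image itself, so that each augmented image comes paired with its ground-truth homography. That warping step is omitted here to keep the sketch dependency-free.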