Intelligent Lifting Systems Based on Digital Operators, Conductors and Supervisors

Zhou, Rui; Miao, Yuanrong; Chen, Yufeng

doi:10.3390/app16094270

Open Access Article


by Rui Zhou, Yuanrong Miao and Yufeng Chen *

The Institute of Systems Engineering, Macau University of Science and Technology, Taipa, Macau SAR, China

* Author to whom correspondence should be addressed.

Appl. Sci. 2026, 16(9), 4270; https://doi.org/10.3390/app16094270

Submission received: 6 March 2026 / Revised: 12 April 2026 / Accepted: 20 April 2026 / Published: 27 April 2026

(This article belongs to the Special Issue Data-Driven Digital Twin for Smart Manufacturing and Industry 4.0)


Abstract

Traditional lifting operations rely heavily on manual experience, which often leads to high operational risks and limited efficiency. To address these issues, this paper proposes an intelligent lifting system with digital operators, conductors, and supervisors, to improve safety and efficiency through multi-agent collaboration. The system uses a BEVFusion-based perception module to support target detection and collision warning during lifting operations. To handle unforeseen situations, a dynamic local lifting path planning method is designed to ensure safe lifting operations. Rather than proposing a fundamentally new algorithm, this study focuses on integrating perception and planning within a unified intelligent lifting system. The experimental results show that the system can support safe lifting operations under the tested conditions and demonstrate its feasibility in practical scenarios.

Keywords:

artificial scene construction; BEVFusion; intelligent lifting system; lifting path planning; smart cranes

1. Introduction

Cranes are widely used in the construction and maintenance of buildings and infrastructure. Their stable operation directly affects the efficiency and safety of engineering projects. Lifting operations, which usually involve heavy loads and high risks, are generally performed by human workers. If the task is planned or executed improperly, it poses a significant risk of crane accidents [1,2,3].

In high-risk scenarios such as post-earthquake reconstruction and fire rescue, safety risks become even more significant [4]. Reducing reliance on manual operations while maintaining safety remains a key challenge. With recent advances in mechatronics and artificial intelligence, intelligent and even autonomous lifting technologies are becoming promising solutions to enhance operational efficiency and safety.

Intelligent lifting systems rely on several key technologies, one of which is environmental perception. The large size of cranes and the rotational motion of the boom and control room create a highly complex working space during lifting operations. This complexity increases collision risks. Accurate environmental perception is essential for crane operation planning [5,6,7,8]. Recently, three-dimensional object detection in a shared bird’s-eye-view (BEV) representation space based on multi-view sensors has attracted increasing attention in the field of autonomous driving [9,10,11]. The primary advantage of BEV perception is that it provides vehicles with a highly comprehensive global perspective, which helps reduce occlusion and improve spatial awareness. In this paper, the BEVFusion method [12] is introduced for the three-dimensional object perception of intelligent cranes. The performance of deep learning models relies on vast amounts of labeled data [13]. In many cases, due to the difficulties in data collection and annotation, researchers cannot obtain sufficient training datasets from real scenarios, which hinders the design and evaluation of algorithms. While these methods provide strong perception capabilities, they are typically developed as standalone modules without considering the interaction with lifting planning in dynamic environments.

Lifting planning is another critical component in intelligent lifting technology, which generates a safe and collision-free path for the cranes to reach the specified operating position [14,15,16]. The planning process of an intelligent lifting system (ILS) usually includes travel path planning and lifting path planning [17,18,19,20]. According to the different planning levels, the relevant algorithms can be divided into two stages, global planning and local planning, namely global travel path and global lifting path planning, as well as the corresponding local travel path and local lifting path planning [21,22]. During the global planning stage, crane states and the global path are determined through a digital map and localization system. In the local planning stage, the system combines the global planning results with crane-state information and real-time sensor data to dynamically adjust and optimize the local path. Local path planning requires the system to possess real-time perception and path search capabilities in dynamic lifting environments, which remains a significant challenge in current research [23]. In practical applications, lifting operations are typically coordinated by project managers or experienced operators [24,25]. However, in dynamic environments, manual coordination often struggles to respond quickly and accurately, compromising operational efficiency and safety.

In recent years, a variety of intelligent algorithms have been developed for crane lifting planning [26,27,28,29,30], such as fast rapidly exploring random tree algorithms, heuristic algorithms, and graph search algorithms. These methods have made significant progress in improving the automation and optimization of path planning. Recent studies show that high-quality path planning should account not only for reachability, but also for task safety, operational convenience, and visibility. Among these, energy efficiency has gradually become a critical issue in crane operations, and related research still requires further exploration [24,27]. Zhang et al. [31] proposed an automated three-dimensional crane path planning method for multi-objective optimization, which uses an improved A* algorithm and comprehensively considers multiple factors to achieve optimal crane path planning. However, the above methods are primarily suited for global lifting path planning and struggle to adapt to dynamic changes in environmental and system conditions. Therefore, local path planning is necessary. This approach integrates obstacle information from the perception system and real-time utilized capacity percentage data from the crane to dynamically adjust the planned path, ensuring lifting operation safety. For instance, when the total load weight approaches the percentage threshold of the crane manufacturer’s specified rated capacity (such as 80% or 90%), the safety risks associated with the lifting operation increase significantly. Existing studies have addressed perception, path planning, and crane operation from different perspectives. However, these components are often developed and evaluated independently, and their interaction under dynamic and uncertain lifting conditions is not fully considered. This highlights the lack of system-level approaches that can effectively coordinate perception and planning for real-time lifting operations.

To address these challenges, we propose a novel ILS to improve the efficiency and safety of crane lifting operations. The system is designed for dynamic operation scenarios. By collaboratively integrating the perception, decision-making and execution processes, it achieves efficient coordination between safety constraints and task execution. The main contributions of this work lie in the integration of perception, planning, and execution at the system level, rather than in proposing a standalone new algorithm. The contributions are summarized as follows:

(1): To address the intelligence requirements of lifting operations, the concept and framework of an ILS are proposed. The system employs a crane–cloud integration approach. Through the collaboration between the cloud platform and the crane platform, it realizes the construction of artificial scenarios and intelligent lifting task planning. Digital supervisors, digital conductors, and digital operators are also introduced to replace or enhance the functions of human supervisors, conductors, and operators in traditional lifting operations.
(2): A BEV-based environmental perception framework is developed to support real-time perception and dynamic obstacle detection in the ILS. In addition, a digital twin-based virtual working environment is constructed and used for data augmentation and offline simulation, enabling the perception system to better cope with the complex and changeable operating environment.
(3): A planning framework is introduced to support local dynamic path planning. Based on real-time environmental information and the real-time crane states, the framework can adjust the crane path online. This ensures collision avoidance and the safe operation of lifting tasks under capacity constraints.

In this context, this study aims to explore how perception, digital twin modeling, and adaptive path planning can be effectively combined to support safe and adaptive lifting operations in dynamic environments. Unlike standalone algorithmic studies, the contribution of this work lies in the system-level coordination of perception, planning, and digital supervision under dynamic lifting conditions. In practical lifting operations, these components are often developed independently, and their interaction is not explicitly modeled. This work provides a structured integration framework that enables consistent information flow and decision-making across modules, which is critical for safety-critical lifting tasks.

2. Intelligent Lifting System

This section presents the overall system architecture and its functional components. The focus here is on how different modules are organized and interact within the system, while detailed algorithmic methods are introduced in Section 3.

Intelligent Lifting System Framework

A traditional lifting task usually requires the collaboration of a supervisor, a conductor, and an operator. The supervisor selects a suitable crane according to the task requirements and determines the position of the crane by jointly considering various factors such as lifting safety and efficiency before the operation starts. During the lifting process, the conductor directs the operation and guides the crane operator through hand signals, while the operator remains in the control room and operates the crane according to the received instructions.

As shown in Figure 1, the core goal of the ILS is to replace or augment, as far as possible, the roles that humans perform in traditional lifting operations through intelligent methods. The system comprises three collaborative digital agents: the digital operator, the digital conductor, and the digital supervisor.

The digital supervisor mainly undertakes the responsibilities of a traditional human supervisor, including selecting cranes according to task requirements and determining their standing areas. In addition, the digital supervisor performs global traveling and lifting path planning, providing guidance for crane operations. The planned paths can also be used to verify the results of crane selection and standing area determination, which reduces safety risks caused by inappropriate crane selection or ill-conceived operational zone arrangements. The digital conductor enhances environmental perception and guidance through a variety of sensors deployed on the cranes. It enables 360-degree environmental monitoring and supports collision warning, dynamic local path planning, and dynamic lifting path planning. After the digital conductor issues instructions, the digital operator translates them into executable operation sequences and controls the crane to perform the corresponding actions. In the current system, the crane executes the planning results in an open-loop manner according to the predefined action sequences. Human operators remain involved in operation monitoring at all times and can take over control of the crane in emergency situations to ensure operational safety. The framework of the ILS is illustrated in Figure 2 (blue indicates the digital supervisor, orange indicates the digital conductor, and green indicates the digital operator).

(1): Digital supervisor:

The digital supervisor monitors the operating states of the ILS from a global system perspective. Its main task is to collect diverse information and build artificial scenarios on the cloud platform for global offline simulation of lifting tasks. The artificial scenarios are three-dimensional environmental models constructed based on drone photogrammetry techniques. Within these three-dimensional models, various obstacles and weather conditions can be simulated to generate diverse offline experimental scenarios. This not only continuously enhances the ability to deal with unexpected situations but also improves the safety of the ILS. The artificial scenarios also support dynamic updates based on the real working environment. When digital conductors perceive new obstacle information during actual operations, the system generates the corresponding model at the appropriate position in the virtual environment. This achieves consistency between the virtual scene and the real environment. Additionally, the digital supervisor has a collision detection function, which calculates the distance between the crane and the surrounding environment in real time and issues an alarm signal based on a predefined distance threshold. Control can also be taken over remotely, permitting manual intervention in certain emergency situations to ensure operational safety. From a system-level perspective, the key functional modules included in the digital supervisor are outlined as follows:

Standing area decision-making is primarily used to determine the positioning points for crane operations. It identifies the area in which a crane can operate by considering parameters such as the crane’s boom length and minimum and maximum radius, while eliminating positions that would lead to collisions with the surrounding environment during crane operations.

Global decision-making and trajectory planning comprises global traveling path planning and global lifting path planning, which generate the global traveling path and lifting path based on the 3D point cloud map and object starting and ending coordinates, as well as crane starting and ending coordinates. The digital supervisor can create diverse and rich scenarios for conducting offline simulation experiments on global decision-making and trajectory planning algorithms, thereby verifying and optimizing the algorithm’s performance.

Perception model training is accomplished through vast datasets, with images collected from the real world and the artificial scene. The training process aims to obtain a high-performance detection model applicable to various lifting scenarios in both real and virtual environments. Subsequently, the detection model is meticulously evaluated and validated in the virtual world using a large dataset with comprehensive labeling. Once a well-trained and thoroughly evaluated detection model is established, it can be applied to real-world settings for real-time detection assessments. However, constrained by practical factors such as safety risks, human resources, and energy consumption, the number of experiments that can be carried out in real environments is still limited, which restricts the comprehensive study and solution of various complex problems in lifting operations. Conversely, experiments conducted in the virtual world offer a safer and more cost-effective alternative. Furthermore, the digital supervisor allows for simulation of a broader range of scenarios, including those less likely to occur in the real world.
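The standing area decision-making module described above can be illustrated with a minimal sketch. It assumes a simplified two-dimensional site plan in which obstacles are circles, and it keeps only the candidate positions from which both the pick-up and the placement points lie within the crane's working radius band. The function name, the tuple layouts, and the `clearance` parameter are hypothetical and are not taken from the paper.

```python
import math


def standing_candidates(candidates, pick, place, r_min, r_max, obstacles, clearance):
    """Filter candidate crane positions (x, y) on a 2D site plan.

    A candidate is kept when both the pick-up and placement points lie within
    the working radius band [r_min, r_max] and the candidate keeps at least
    `clearance` metres from every circular obstacle (x, y, radius).
    This is an illustrative simplification, not the paper's procedure.
    """
    def reachable(crane, point):
        return r_min <= math.dist(crane, point) <= r_max

    def clear(crane):
        return all(
            math.dist(crane, (ox, oy)) >= o_radius + clearance
            for ox, oy, o_radius in obstacles
        )

    return [
        c for c in candidates
        if reachable(c, pick) and reachable(c, place) and clear(c)
    ]


# Example: one candidate sits on an obstacle, one is too far away.
kept = standing_candidates(
    candidates=[(0.0, 0.0), (10.0, 0.0), (30.0, 0.0)],
    pick=(10.0, 10.0),
    place=(10.0, -10.0),
    r_min=5.0,
    r_max=20.0,
    obstacles=[(0.0, 0.0, 1.0)],
    clearance=2.0,
)
```

A full implementation would also have to check the swept volume of the boom during operation, which the paper handles through path planning and verification.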

(2): Digital conductor:

The digital conductor is responsible for monitoring system parameters and indicators and then issuing reliable warnings and making decisions based on real-time environmental changes. The digital conductor receives the global data provided by the digital supervisor and continuously monitors external obstacle information and the crane’s own status in real time during the lifting process to fulfill local path planning and operational path planning. The following modules describe how perception and planning are integrated at the system level in the digital conductor:

Perception based on BEVFusion uses various sensors (such as LiDAR and cameras) to perceive environmental information. The perception model is acquired through model updates by the digital supervisor. The digital conductor obtains real-world sensor data and uses the model to infer object and obstacle categories and positions. This information is used in two ways: first, it is provided to the digital supervisor for real-time updating of obstacles and objects within the three-dimensional scene, and second, it is used for local decision-making and trajectory planning for real-time path adjustments. Based on obstacle information, the digital conductor can provide early warnings of potential collisions between the crane and obstacles, issuing real-time alerts in both the real world and the virtual world.

Local decision-making and trajectory planning comprises local traveling path planning and local lifting path planning. Based on the global traveling path and global lifting path provided by the digital supervisor, the digital conductor adjusts the local path in real time by incorporating perceived obstacle information and the crane’s status. The resulting adjusted path is then communicated to the digital operator. The local traveling path from the digital conductor includes the crane’s motion coordinates, speed, and acceleration among other parameters. The local lifting path includes trajectory points, speed, and related data for luffing, swinging, and hoisting.

(3): Digital operator:

The digital operator, in the real world, controls the crane to execute the corresponding paths provided by the digital conductor and updates the real-time status of the artificial crane in the virtual world. Through the digital operator, the crane can accurately track the working trajectory and precisely complete the hoisting task. Based on the above system architecture, the ILS relies on real-time environmental perception and safety-oriented motion planning to ensure reliable execution under dynamic and uncertain operating conditions.

3. Key Technologies of Intelligent Lifting System

The key technologies of the ILS are outlined in this section, including perception based on BEVFusion and dynamic local lifting path planning.

3.1. Perception Based on BEVFusion

This subsection focuses not only on the mathematical formulation, but also on how the perception components are implemented and interact within the system. In this work, the BEV-based perception module is treated as a supporting sensing component of the ILS, rather than a standalone research contribution. The sensors used in the perception system include cameras and LiDARs. Combining the two modalities improves the accuracy of object detection and recognition and yields a more thorough and reliable picture of the environment.

To fuse the data from multiple sensors, the BEVFusion algorithm first generates a unified BEV representation of the environment. This BEV representation is then used to train the multi-task neural network, which predicts object labels, instance masks, and semantic labels simultaneously. In situations where one or more sensors are partially or fully obscured, the method has shown stable performance, underscoring the significance of multi-sensor fusion for reliable perception in applications such as intelligent lifting and autonomous driving.

To address the difficulty of collecting and annotating large-scale and diverse datasets, artificial scenes are used to generate large-scale labeled lifting-scene data, which are combined with limited real-world driving images to train a powerful BEVFusion detector in the offline simulation experiment module. In addition to various kinds of static and dynamic objects, the season, weather, and lighting can also be configured in an artificial scene. The 3D models of static and dynamic objects can be created with tools such as Unity3D (Unity 2022.3.62f2c1). Dynamic objects such as vehicles and pedestrians can be instantiated with obstacle-avoidance capabilities, and their paths can be generated for analysis. Specifically, the artificial lifting environment requires three-dimensional reconstruction through tilt photography technology, which accurately captures the appearance, location, height, and other attributes of features. Unmanned aerial vehicles make it possible to swiftly gather image data for fully automated three-dimensional modeling. After processing, the data collected by unmanned aerial vehicle tilt photography can generate map files in various formats, in which the real-world model is used for three-dimensional scene display and the point cloud map is used in the algorithm module. A proper simulation environment for the ILS also provides strong support for subsequent functions such as standing area selection and path planning.

The offline simulation experiment process also contains two sub-procedures, namely, learning and training, as well as testing and evaluation. The use of large-scale datasets greatly enhances the efficacy of model training, while an abundant supply of near-real data substantially facilitates model evaluation. Finally, the object detection model is trained and assessed simultaneously in both the real world and the virtual environment. The object detector can be optimized online based on its performance in the two digital twin-based virtual environments. Figure 3 shows a simplified architecture of the perception system of crane lifting. The current system achieves model updates in the real world through an offline process. The capability for online updates will be investigated and implemented in future work. The model is trained using standard supervised learning procedures with predefined training and validation splits. The perception outputs are continuously provided to the local planning module, serving as the primary trigger for dynamic replanning and collision avoidance. The replanning process is triggered when environmental changes exceed predefined thresholds, such as obstacle intrusion or safety margin violations.
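The replanning trigger described above can be sketched as a safety margin check along the planned path. This is a minimal illustration under stated assumptions: obstacles are approximated by bounding spheres, the path is a list of Cartesian waypoints in metres, and the 2 m margin is a placeholder. The `Obstacle` structure and the function names are hypothetical and do not come from the paper or from BEVFusion.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Obstacle:
    """Detected obstacle approximated by a bounding sphere (hypothetical)."""
    x: float
    y: float
    z: float
    radius: float


def min_clearance(path, obstacle):
    """Smallest distance from any path waypoint to the obstacle surface."""
    centre = (obstacle.x, obstacle.y, obstacle.z)
    return min(math.dist(p, centre) - obstacle.radius for p in path)


def needs_replanning(path, obstacles, safety_margin=2.0):
    """True when any detected obstacle violates the safety margin."""
    return any(min_clearance(path, ob) < safety_margin for ob in obstacles)


# Example: a distant obstacle is ignored, an intruding one triggers replanning.
path = [(10.0, 0.0, 5.0), (10.0, 5.0, 5.0), (10.0, 10.0, 5.0)]
far = Obstacle(30.0, 5.0, 5.0, 1.0)
near = Obstacle(11.0, 5.0, 5.0, 0.5)
```

Checking only the waypoints is a simplification; a denser sampling of the path or a segment-to-sphere distance would be needed for sparse paths.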

3.2. Dynamic Local Lifting Path Planning

From an implementation perspective, the proposed dynamic local lifting planning process can be summarized as follows. First, environmental information is obtained from the perception module and mapped into the configuration space. Then, a global path is generated based on predefined constraints. During execution, real-time perception updates are used to detect unexpected changes, and the local path is dynamically adjusted by incorporating obstacle information and torque limiter constraints. This process ensures that both safety and feasibility are considered throughout the lifting operation. This work focuses on extending existing path planning methods as an application-oriented approach by incorporating real-time perception feedback and capacity-aware replanning. The current lifting planning algorithms mainly focus on global lifting path planning, which involves determining a lift path considering multiple factors such as three-dimensional maps, the weight and dimensions of the load, etc.

However, the above-mentioned approach performs lifting path planning under the assumption of a predetermined three-dimensional scene. In actual lifting operations, it is necessary to consider unforeseen circumstances both externally and internally to the crane, which require dynamic adjustments to the lifting path, i.e., local lifting path planning. These unexpected scenarios may include the sudden appearance of obstacles not previously present in the three-dimensional scene or the crane’s payload approaching the capacity percentage threshold (e.g., 80% or 90%) in certain orientations, which could potentially lead to crane collisions or tipping incidents. Table 1 compares the factors considered in global lifting path planning and dynamic local lifting path planning.

In order to address the limitations of global path planning, an improved dynamic local lifting planning method, which considers unforeseen circumstances both externally and internally to the crane, is proposed based on the automatic three-dimensional (3D) lift planning method [31].

By deliberately introducing the energy and time consumption models, the energy and time consumed by various movements can be quantified and analyzed. The objective function based on the energy and time consumption models naturally accounts for energy and time efficiency. A theorem is proposed to determine the visibility of the planned path, which is involved in the time consumption model.

The posture of the mobile crane can be described by three variables that correspond to the three forms of crane movement (luff, swing, and lift), collected in C = {θ, σ, r}, where θ is the luffing angle, σ is the swing angle and r is the sling length. Consequently, the payload position can also be described in the form C = {θ, σ, r} for the crane posture configuration, as shown in Figure 4. The two descriptions (Cartesian coordinate system and C-Space system) of the payload’s position can be transformed into each other [31].
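As an illustration of the two descriptions, the following sketch converts between C-Space and Cartesian coordinates. It assumes a fixed boom length, an origin at the luffing center, a swing angle measured from the x-axis, and a payload hanging vertically below the boom tip, so that the horizontal radius is L_b cos θ and the payload height is L_b sin θ − r. The exact transformation used in [31] is not reproduced here, and the function names are hypothetical.

```python
import math


def cspace_to_cartesian(theta, sigma, r, boom_length):
    """Map (luffing angle, swing angle, sling length) to payload (x, y, z)."""
    radius = boom_length * math.cos(theta)   # horizontal reach of the boom tip
    x = radius * math.cos(sigma)
    y = radius * math.sin(sigma)
    z = boom_length * math.sin(theta) - r    # boom-tip height minus sling length
    return x, y, z


def cartesian_to_cspace(x, y, z, boom_length):
    """Inverse mapping, valid for luffing angles in [0, pi/2]."""
    radius = math.hypot(x, y)
    if radius > boom_length:
        raise ValueError("point lies outside the reach of the boom")
    theta = math.acos(radius / boom_length)
    sigma = math.atan2(y, x)
    r = boom_length * math.sin(theta) - z
    return theta, sigma, r


# Round trip for a 30 m boom.
xyz = cspace_to_cartesian(math.pi / 3, math.pi / 4, 5.0, boom_length=30.0)
back = cartesian_to_cspace(*xyz, boom_length=30.0)
```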

Moreover, collision avoidance must be considered as a lifting constraint. As shown in Figure 5, a collision may occur between the boom (the brown straight line in Figure 5a), an obstacle (the blue rectangle in Figure 5a), and the payload (the green rectangle in Figure 5a) during the crane lifting operation; the star in Figure 5a marks the collision point. Specifically, collisions may occur between the payload and the boom, between obstacles and the boom, and between obstacles and the payload. The constraint of the lifting path [31] can be expressed by Equation (1). During task planning, the luffing angle θ must be set to ensure the equipment maintains a safe clearance from the nearest obstacle to prevent collision. Concurrently, the hoisting length of the sling is subject to dual constraints. Its upward movement must avoid collision between obstacles and boom, while downward movement requires preventing the load or sling from colliding with surrounding obstacles. These regions are represented as non-passable areas in the C-Space map. In Figure 5b, points 1 and 2 denote the critical points of the collision boundary. The purple and green arcs are generated from these critical points to characterize the configuration boundaries corresponding to critical contact between the boom and the obstacle, thereby separating the feasible region from the infeasible region in the C-Space. The brown arc denotes the trajectory of the boom tip for a fixed boom length L_{b}.

C (θ, σ) = \begin{cases} r (θ) > R_{l} \cdot \tan θ + H_{l} / 2 \\ θ (σ) > \max (\arctan (H_{oi} (σ) / D_{oi} (σ))) \\ r (θ, σ) < L_{b} \sin θ - H_{o} (θ, σ) - H_{l} / 2 \end{cases}

(1)

where L_{b} is the length of the boom, R_{l} and H_{l} are the radius and height of the payload, r (θ) is the distance between the center of the payload and the boom tip, H_{oi} (σ) represents the height of the obstacle, D_{oi} (σ) indicates the obstacle’s distance to the luffing center and H_{o} (θ, σ) is the height of the top of the obstacle. The luffing angle θ determines the horizontal position of the load R_{l} = L_{b} cos θ and the vertical reference height L_{b} sin θ. As θ increases, the load retracts toward the luffing center, and the vertical reference height is elevated. The hoist rope length r (θ, σ) directly modifies the load’s centroid height: L_{b} sin θ - r (θ, σ). Retracting the rope (decreasing r) raises the load height, while extending the rope (increasing r) lowers it. The load outline r_{l} / r_{2} expands the collision risk boundary, further contracting the actual safe operating space geometrically. Increasing θ raises the operating reference height and reduces the operating radius, enabling the crossing over of higher obstacles but restricting the horizontal operating range; shortening r (θ, σ) elevates the load height but increases the risk of load–boom collision. An increase in obstacle height H_{o} or a decrease in distance D_{o} simultaneously tightens the feasible range of θ and r (θ, σ), compressing the safe operating window. Following the study in [31], these constraint conditions are initially computed only once.
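The three conditions of Equation (1) can be checked for a single configuration as in the sketch below. It reads the second condition as the arctangent of obstacle height over obstacle distance, which is an assumption about the intended grouping, and it takes the obstacle-top height H_o as a direct input. The `Obstacle` structure and the function name are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Obstacle:
    height: float    # H_oi(σ): obstacle height at the current swing angle
    distance: float  # D_oi(σ): horizontal distance to the luffing center


def satisfies_constraints(theta, r, *, boom_length, load_radius, load_height,
                          obstacles, obstacle_top):
    """Evaluate the three inequalities of Equation (1) for one (θ, r) pair."""
    # Payload must hang far enough below the boom tip to stay clear of the boom.
    clear_of_boom = r > load_radius * math.tan(theta) + load_height / 2
    # Boom must be raised above the line of sight to every obstacle top.
    min_theta = max(
        (math.atan2(o.height, o.distance) for o in obstacles), default=0.0
    )
    boom_clears = theta > min_theta
    # Payload bottom must stay above the obstacle top (H_o).
    load_clears = r < boom_length * math.sin(theta) - obstacle_top - load_height / 2
    return clear_of_boom and boom_clears and load_clears


# Example: 40 m boom, payload of radius 1 m and height 2 m, one 10 m obstacle.
params = dict(
    boom_length=40.0, load_radius=1.0, load_height=2.0,
    obstacles=[Obstacle(height=10.0, distance=20.0)], obstacle_top=10.0,
)
ok = satisfies_constraints(math.pi / 3, 10.0, **params)
too_short = satisfies_constraints(math.pi / 3, 2.0, **params)
too_long = satisfies_constraints(math.pi / 3, 30.0, **params)
boom_too_low = satisfies_constraints(math.radians(20), 2.0, **params)
```

Evaluating this predicate over a discretized (θ, σ) grid would yield the non-passable regions of the C-Space map described above.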

When searching a grid map cluttered with obstacles, each grid cell in the A* algorithm is a node that is either free or occupied by an obstacle. As shown in Equations (2)–(7), the cost function for the k-th node includes the actual cost from the start to the k-th node and a heuristic estimate of the cost to the goal. We employ the method proposed in [31] to compute the actual cost g (k) and the energy–time heuristic h (k), i.e.,

J (k) = h (k) + g (k)

(2)

where

\begin{aligned} h (k) = {} & \frac{1 + \operatorname{sign} (θ_{g} - θ_{k})}{2} k_{1} (\sin θ_{g} - \sin θ_{k}) \\ & + k_{2} | σ_{k} - σ_{g} | \cos (\max (θ_{k}, θ_{g})) \\ & + \frac{1 + \operatorname{sign} (r_{k} - r_{g})}{2} k_{3} (r_{k} - r_{g}) \\ & + \frac{| θ_{k} - θ_{g} |}{\dot{θ}_{\mathrm{steady}}} + \frac{| σ_{g} - σ_{k} |}{\dot{σ}_{\mathrm{steady}}} + \frac{| r_{g} - r_{k} |}{\dot{r}_{\mathrm{steady}}} \end{aligned}

(3)

g (k) = g (k - 1) + c_{k, k - 1} = c_{k, k - 1} + c_{k - 1, k - 2} + \cdots + c_{1, s}

(4)

where c_{k, k - 1} represents the actual cost incurred in moving from the (k - 1)-th node to the k-th node, and g (s) = 0 at the start node.

c_{k, k - 1} = α_{e} \frac{F_{e} (k, k - 1) + E_{θ} + E_{σ} + E_{r}}{F_{e} (g, s)} + α_{t} \frac{F_{t} (k, k - 1)}{F_{t} (g, s)}

(5)

\begin{aligned} F_{e} (x, y) = {} & \frac{1 + \operatorname{sign} (θ_{x} - θ_{y})}{2} k_{1} (\sin θ_{x} - \sin θ_{y}) \\ & + k_{2} | σ_{y} - σ_{x} | \cos (\max (θ_{y}, θ_{x})) \\ & + \frac{1 + \operatorname{sign} (r_{y} - r_{x})}{2} k_{3} (r_{y} - r_{x}) \end{aligned}

(6)

F_{t} (x, y) = \frac{| θ_{y} - θ_{x} |}{\dot{θ}_{\mathrm{steady}}} + \frac{| σ_{x} - σ_{y} |}{\dot{σ}_{\mathrm{steady}}} + \frac{| r_{x} - r_{y} |}{\dot{r}_{\mathrm{steady}}}

(7)
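Equations (3) and (5)–(7) can be transcribed almost directly into code, which makes the role of the sign-based switches easier to see: luffing energy is charged only when the boom is raised, and hoisting energy only when the sling is shortened. The sketch below is an illustration, not the implementation of [31]. The `Config` type, the function names, and the single `kinetic_loss` argument (standing in for the sum E_θ + E_σ + E_r) are assumptions, as are all numeric values used with it.

```python
import math
from typing import NamedTuple


class Config(NamedTuple):
    theta: float  # luffing angle θ (rad)
    sigma: float  # swing angle σ (rad)
    r: float      # sling length r (m)


def sign(v):
    """Return -1, 0 or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def energy(x, y, k1, k2, k3):
    """F_e(x, y) as in Equation (6)."""
    luff = (1 + sign(x.theta - y.theta)) / 2 * k1 * (math.sin(x.theta) - math.sin(y.theta))
    swing = k2 * abs(y.sigma - x.sigma) * math.cos(max(y.theta, x.theta))
    hoist = (1 + sign(y.r - x.r)) / 2 * k3 * (y.r - x.r)
    return luff + swing + hoist


def duration(x, y, rates):
    """F_t(x, y) as in Equation (7); `rates` holds the steady-state speeds."""
    return (abs(y.theta - x.theta) / rates.theta
            + abs(x.sigma - y.sigma) / rates.sigma
            + abs(x.r - y.r) / rates.r)


def edge_cost(node, prev, start, goal, *, alpha_e, alpha_t, k, rates,
              kinetic_loss=0.0):
    """c_{k,k-1} as in Equation (5), normalized by the start-to-goal values."""
    e = (energy(node, prev, *k) + kinetic_loss) / energy(goal, start, *k)
    t = duration(node, prev, rates) / duration(goal, start, rates)
    return alpha_e * e + alpha_t * t


def heuristic(node, goal, k, rates):
    """h(k) as in Equation (3): remaining energy plus remaining time."""
    return energy(goal, node, *k) + duration(goal, node, rates)
```

With weights that sum to one and no kinetic loss, a single edge running straight from start to goal has a normalized cost of exactly one, which is a convenient sanity check on the normalization.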

However, the above-mentioned cost function does not consider unforeseen circumstances both externally and internally to the crane during the lifting process, such as the sudden appearance of obstacles and the varying overall weight of the payload at different luffing angles.

In the improved algorithm, if new obstacles are detected during the crane lifting process, the digital conductor re-evaluates the constraint conditions C (θ, σ). If the crane’s current operational state C' (θ, σ) satisfies these constraints C (θ, σ), it computes the local lifting path based on the updated C-Space map. However, if the crane’s current operational state C' (θ, σ) does not meet the constraint conditions C (θ, σ), the digital conductor synchronously issues an alert in both the real world and the virtual world, signaling the need for crane experts to take over the system and halt the lifting task.

The luffing angle can be confined within a safe range in global lifting planning. However, in the actual lifting process, affected by various factors such as changes in slewing speed, angle measurement errors, and deviations in load weight, the overall weight of the payload may still be close to or even exceed the rated load percentage threshold of the crane, which increases the risk of accidents. To address this issue, we dynamically acquire the load corresponding to the current crane posture through the torque limiter and adjust the operational path accordingly to enhance crane safety. We also introduce a condition to the cost function: when the overall weight of the payload exceeds the capacity percentage threshold (here, 80% of the crane’s rated capacity), that point is assigned infinite cost, thereby rendering it impassable.
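The effect of the infinite-cost condition can be seen in a small grid search. The sketch below runs A* on a two-dimensional grid in which each cell stores a utilized capacity ratio, as would be reported by the torque limiter, and cells above the threshold are never expanded. Unit step costs and a Manhattan heuristic stand in for the energy and time cost of the paper, and the grid values and function name are hypothetical.

```python
import heapq
import math


def plan(utilization, start, goal, threshold=0.80):
    """A* over a 2D grid; cells above the capacity threshold cost infinity."""
    rows, cols = len(utilization), len(utilization[0])

    def step_cost(cell):
        i, j = cell
        return math.inf if utilization[i][j] > threshold else 1.0

    def h(cell):
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(h(start), 0.0, start)]
    best = {start: 0.0}
    parent = {}
    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if cell == goal:
            path = [cell]
            while cell in parent:
                cell = parent[cell]
                path.append(cell)
            return path[::-1]
        if g > best[cell]:
            continue  # stale heap entry
        i, j = cell
        for nxt in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
            if not (0 <= nxt[0] < rows and 0 <= nxt[1] < cols):
                continue
            ng = g + step_cost(nxt)
            # Infinite cost makes the cell impassable.
            if math.isinf(ng) or ng >= best.get(nxt, math.inf):
                continue
            best[nxt] = ng
            parent[nxt] = cell
            heapq.heappush(open_heap, (ng + h(nxt), ng, nxt))
    return None  # no path that respects the capacity threshold


# Example: the middle column is overloaded except in the last row.
grid = [
    [0.50, 0.85, 0.50],
    [0.50, 0.85, 0.50],
    [0.50, 0.60, 0.50],
]
path = plan(grid, (0, 0), (0, 2))
```

In this example the planner detours through the bottom row rather than crossing the overloaded cells, and it returns `None` when every route would exceed the threshold.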

To further integrate energy efficiency, execution time, and torque limiter safety conditions into a unified planning framework, we redesign the cost function of the A* algorithm. Instead of enforcing safety constraints only through mandatory stopping conditions, the proposed method embeds safety-related penalties into the path evaluation process, enabling the planner to proactively avoid high-risk configurations. To improve the computational efficiency of the improved A* algorithm, the parameter $r_{m}$ is set to the minimum sling length permitting the hoisting path. This transforms the search graph from a three-dimensional $(θ, σ, r)$ configuration space to a two-dimensional $(θ, σ)$ space. Through this series of simplifications, a cost function based on explicit models of energy consumption and time consumption is derived, as shown in Equation (8):

J = g(k) + h(k) = G + α_{e} \frac{k_{3} (r_{s} - r_{m}) + E_{r}}{F_{e}(g, s)} + α_{t} \frac{(|r_{s} - r_{m}| + |r_{g} - r_{m}|) / r_{st}}{F_{t}(g, s)}

(8)

where $J$ denotes the total cost of the three-dimensional lifting path planning in the A* algorithm, and $G$ represents the actual cost of the optimal sub-path. The parameters $α_{e}$ and $α_{t}$ are the weighting coefficients for energy consumption and time consumption, respectively. The terms $F_{e}(g, s)$ and $F_{t}(g, s)$ correspond to the theoretical minimum energy consumption and the theoretical minimum execution time between the start and end points. The variable $E_{r}$ denotes the kinetic energy loss associated with a single hoisting operation. The rope lengths at the start and end points are denoted by $r_{s}$ and $r_{g}$, respectively, and $r_{m}$ represents the minimum rope length selected during the planning process. In addition, the parameter $k_{3}$ denotes the potential energy coefficient per unit length along the rope direction, and $r_{st}$ represents the steady-state velocity of the hoisting rope [31].

As shown in Equations (9) and (10), to explicitly account for torque limiter-based safety during path planning, an additional safety term is introduced, leading to the augmented cost function.

\begin{aligned} h'(k) & = h(k) + α_{s} \frac{\sum_{k \in \mathrm{path}} [\max(0, u(q_{k}) - u_{th})]^{2}}{F_{s,h}} \\ & = α_{e} \frac{k_{3} (r_{s} - r_{m}) + E_{r}}{F_{e}(g, s)} + α_{t} \frac{(|r_{s} - r_{m}| + |r_{g} - r_{m}|) / r_{st}}{F_{t}(g, s)} + α_{s} \frac{\sum_{k \in \mathrm{path}} [\max(0, u(q_{k}) - u_{th})]^{2}}{F_{s,h}} \end{aligned}

(9)

J^{'} = g (k) + h^{'} (k)

(10)

where $J'$ denotes the total cost of the three-dimensional lifting path planning with the torque limiter incorporated, and $h'(k)$ is the correspondingly augmented form of $h(k)$. The parameter $α_{s}$ is the safety weighting coefficient, which together with $α_{e}$ and $α_{t}$ constitutes the set of weighting coefficients. The weighting parameters are selected by empirical tuning to balance safety constraints and operational efficiency. The parameter $u_{th}$ denotes the torque limiter threshold coefficient, and the constant $F_{s,h}$ represents the upper bound of the safety cost along the path length under the maximum possible degree of torque limit violation. The term $u(q_{k})$ denotes the torque limiter utilization at a specific point along the optimal sub-path and is defined in Equation (11) as

u(q_{k}) = \frac{m}{m_{\max}(θ, σ, L_{b})}

(11)

where $m$ is the current payload and $m_{\max}(θ, σ, L_{b})$ is the maximum allowable payload for the given configuration, obtained from a lookup table.
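The utilization of Equation (11) and the safety term of Equation (9) can be illustrated numerically. The sketch below is a minimal example under stated assumptions: the per-node allowable payloads are made-up numbers standing in for the lookup table, `f_sh = 1.0` is a placeholder for $F_{s,h}$, and the function names are hypothetical. The weight $α_{s}$ is left out so that only the penalty itself is shown.

```python
def utilization(payload_t: float, max_payload_t: float) -> float:
    """Eq. (11): u(q_k) = m / m_max for one configuration."""
    if max_payload_t <= 0:
        raise ValueError("maximum allowable payload must be positive")
    return payload_t / max_payload_t


def safety_penalty(utilizations, u_th: float, f_sh: float) -> float:
    """Safety term of Eq. (9) without alpha_s: squared excess over u_th, normalized."""
    return sum(max(0.0, u - u_th) ** 2 for u in utilizations) / f_sh


# Hypothetical path with a 6 t payload; the allowable payloads per node are made up.
m_max_along_path = [10.0, 8.0, 7.5, 6.0]
us = [utilization(6.0, m) for m in m_max_along_path]
print([round(u, 2) for u in us])                          # [0.6, 0.75, 0.8, 1.0]
print(round(safety_penalty(us, u_th=0.8, f_sh=1.0), 4))   # 0.04
```

Only nodes whose utilization exceeds `u_th` contribute, and the square makes larger violations disproportionately expensive, which is what lets the planner steer away from high-risk configurations before a hard stop is needed.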

Let $p(k)$ be the weight of the payload at the $k$-th node and $\bar{p}(k)$ be the average overall weight of the payload from the starting point to the $k$-th node. If $p(k - 1) \geq 0.8 \bar{p}(k - 1)$ and $θ(k) < θ(k - 1)$, then $h'(k) = \infty$. Therefore, the improved heuristic $h'(k)$ can be expressed as Equation (12):

h'(k) = \begin{cases} h^{s}(k), & \text{if } p(k - 1) < 0.8 \bar{p}(k - 1) \\ h^{s}(k), & \text{if } p(k - 1) \geq 0.8 \bar{p}(k - 1) \text{ and } θ(k) \geq θ(k - 1) \\ \infty, & \text{if } p(k - 1) \geq 0.8 \bar{p}(k - 1) \text{ and } θ(k) < θ(k - 1) \end{cases}

(12)

where $h^{s}(k) = h(k) + α_{s} \frac{\sum_{k \in \mathrm{path}} [\max(0, u(q_{k}) - u_{th})]^{2}}{F_{s,h}}$ is calculated according to Equation (9).
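The case distinction of Equation (12) translates into a short function. The following is a minimal sketch; the function and argument names are hypothetical, and `h_s` is assumed to have been computed beforehand according to Equation (9).

```python
import math


def improved_heuristic(h_s: float, p_prev: float, p_bar_prev: float,
                       theta_k: float, theta_prev: float,
                       ratio: float = 0.8) -> float:
    """Eq. (12): return h_s unless the load condition holds and the boom luffs down."""
    load_condition = p_prev >= ratio * p_bar_prev   # p(k-1) >= 0.8 * p_bar(k-1)
    luffing_down = theta_k < theta_prev             # theta(k) < theta(k-1)
    if load_condition and luffing_down:
        return math.inf                             # node becomes impassable
    return h_s


# Load condition not met: the luffing direction does not matter.
print(improved_heuristic(3.2, p_prev=4.0, p_bar_prev=6.0, theta_k=0.9, theta_prev=1.0))  # 3.2
# Load condition met, luffing angle kept or increased: still allowed.
print(improved_heuristic(3.2, p_prev=6.0, p_bar_prev=6.0, theta_k=1.1, theta_prev=1.0))  # 3.2
# Load condition met and luffing angle decreasing: blocked.
print(improved_heuristic(3.2, p_prev=6.0, p_bar_prev=6.0, theta_k=0.9, theta_prev=1.0))  # inf
```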

Furthermore, the number of nodes that the A* algorithm must search grows exponentially with the dimension of the configuration space, so dimension reduction significantly decreases the search time. The improved A* algorithm therefore accounts for unforeseen circumstances external and internal to the crane while reducing the search map from three dimensions to two. The overall workflow integrates perception, global planning, and local replanning in a sequential manner, enabling adaptive responses to dynamic changes during lifting operations. The implementation procedure of the proposed planning method is summarized as pseudocode in Table 2.
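To make the dimension-reduction argument concrete: with n cells per axis, a 3D $(θ, σ, r)$ grid has n³ nodes, whereas the 2D $(θ, σ)$ grid has n². The sketch below is a textbook A* search on a 4-connected 2D grid, given only as an illustration. It assumes unit step costs and a Manhattan-distance heuristic rather than the cost function of Equations (8)–(12), and blocked cells stand in for obstacle cells or cells with infinite cost. All names are hypothetical.

```python
import heapq


def astar_2d(blocked, start, goal):
    """Textbook A* on a 4-connected grid; blocked[i][j] marks impassable cells."""
    rows, cols = len(blocked), len(blocked[0])

    def h(cell):
        # Manhattan distance: admissible and consistent for unit-cost 4-connected moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(h(start), 0, start)]   # entries are (f, g, cell)
    best_g = {start: 0}
    parent = {start: None}
    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if cell == goal:
            path = []
            while cell is not None:      # walk parents back to the start
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        if g > best_g[cell]:
            continue                     # stale heap entry, a better one was processed
        i, j = cell
        for ni, nj in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
            if 0 <= ni < rows and 0 <= nj < cols and not blocked[ni][nj]:
                new_g = g + 1
                if new_g < best_g.get((ni, nj), float("inf")):
                    best_g[(ni, nj)] = new_g
                    parent[(ni, nj)] = cell
                    heapq.heappush(open_heap, (new_g + h((ni, nj)), new_g, (ni, nj)))
    return None                          # no feasible path


grid = [[False] * 5 for _ in range(5)]
for i in range(4):
    grid[i][2] = True                    # wall in column 2 with a gap at row 4
path = astar_2d(grid, (0, 0), (0, 4))
print(len(path) - 1)                     # 12 steps around the wall
```

Returning `None` when the open list is exhausted corresponds to the failure branch of the procedure, in which planning stops and manual operation takes over.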

4. System Realization

This section introduces the system implementation and practical applications of the ILS. Experimental crane lift tasks are conducted with a boom crane to evaluate the ILS's performance. The crane features a basic boom length of 11.6 m, a fully extended boom length of 45 m, and a maximum lifting capacity of 50 tons. The steady velocities for luffing $\dot{θ}_{\mathrm{steady}}$, swing $\dot{σ}_{\mathrm{steady}}$, and hoisting $\dot{r}_{\mathrm{steady}}$ are 0.5°/s, 0.5°/s, and 1 m/s, respectively. All experiments are conducted under controlled conditions and should be interpreted as feasibility-oriented validation. To reduce potential systematic bias, the experimental scenarios are varied in terms of initial crane positions, obstacle configurations, and operating conditions, rather than relying on a single fixed setup.

4.1. Perception Based on BEVFusion

The following experimental results are provided to verify the reliability and practical feasibility of the adopted perception module under the current experimental setup, rather than to claim algorithmic superiority over existing BEV-based methods. Datasets covering 100 crane operation scenarios were assembled, with information on lighting, weather, locations, entry routes, and working conditions. The datasets include point cloud and panoramic image data obtained through LiDAR (RS-Bpearl and RS-Helios-5515, RoboSense Technology Co., Ltd., Shenzhen, China) and six cameras (SG2-IMX390C-5200-GMSL2-H120L, Shenzhen SENSING Technology Co., Ltd., Shenzhen, China) installed on the crane. The sensor installation layout is shown in Figure 6, where the boxes mark the installation points.

Data are collected under sufficient light (morning period) and insufficient light (afternoon period); examples are presented in Figure 7. Annotation is performed manually, with precise labels applied to the point cloud data and their corresponding images. To ensure the standardization and reusability of the annotation results, all outputs are stored uniformly in JSON format, which records the annotation tool version, object class label, annotation region coordinates, annotation shape type, image path, image height, and image width. After jointly annotating the point cloud and panoramic image data, we divide the datasets into training, validation, and testing sets with a ratio of 7:1:2.
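The 7:1:2 division can be sketched as follows. This is a minimal illustration that assumes the split is made at the level of scenario identifiers with a fixed random seed; the text does not state whether the split is by scenario or by frame, and the function name and seed are hypothetical.

```python
import random


def split_dataset(items, ratios=(7, 1, 2), seed=0):
    """Shuffle items reproducibly and split them into train/validation/test parts."""
    items = list(items)
    random.Random(seed).shuffle(items)        # fixed seed for a repeatable split
    total = sum(ratios)
    n_train = len(items) * ratios[0] // total
    n_val = len(items) * ratios[1] // total
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]            # the remainder goes to the test set
    return train, val, test


train, val, test = split_dataset(range(100))  # 100 scenario IDs (assumed unit of splitting)
print(len(train), len(val), len(test))        # 70 10 20
```

Splitting by scenario rather than by frame would keep near-duplicate frames from the same scene out of both the training and testing sets, which is why it is assumed here.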

The main object categories include cranes, pedestrians, and typical site obstacles. The perception model is trained on a workstation equipped with GPU acceleration. Input data are preprocessed to ensure consistent spatial resolution and coordinate alignment. Standard training settings, including batch size, learning rate, and training epochs, are used to ensure stable convergence. It should be noted that the purpose of this module is to support system-level feasibility rather than to achieve state-of-the-art detection performance. We use mean Average Precision (mAP) to assess the performance of the BEVFusion algorithm, measuring the average detection accuracy of all object classes based on their projected positions on the ground plane. Table 3 presents the test mAP of the BEVFusion algorithm for different input image sizes and voxel sizes of the point cloud.

The results in Table 3 show that the BEVFusion algorithm fuses point cloud data and panoramic images effectively for target detection and localization. The test mAP increases with a larger input image size and a smaller voxel size, from 76.5 at an image size of 128 × 352 and a voxel size of 0.125 to 80.0 at 384 × 1056 and 0.075. This is mainly because a larger image provides more detailed environmental and target information, while a smaller voxel size helps preserve detail in the point cloud data.

The results of object detection and localization are illustrated in Figure 8. The brown boxes and yellow boxes represent cranes and pedestrians, respectively. The central black image shows point cloud data obtained by a LiDAR sensor installed on the crane. The six images around it are obtained from six cameras distributed around the crane, providing 360-degree perception of the crane’s surroundings.

3D artificial scenes are established using unmanned aerial vehicle (UAV) oblique photography techniques and imported into the PILS software (Embedded Target V6.07.00; Renesas Electronics, Tokyo, Japan) developed in Unity3D. Within the PILS software, cranes can be operated manually, and various obstacles can be arranged. Multiple objects are created, including containers, barrels, flames, and humans, as shown in Figure 9. The variety of objects will be further expanded, and their realism will be optimized in future work. By importing these objects, diverse artificial scenes can be constructed, which not only generate extensive datasets for algorithm training but also enable the testing of algorithm performance in various scenarios. The perception module is designed to provide reliable 3D environmental information for the proposed 3D lifting path planning algorithm. The perception component adopts a mature framework with standard configurations, rather than a novel design.

To address the difficulty of collecting and annotating large-scale, diverse datasets, we use the artificial scenes to supplement the datasets with images and laser point cloud data collected from an artificial crane, thereby enhancing the BEVFusion algorithm's performance. A portion of the data is shown in Figure 10, where the red points represent simulated laser point cloud data. The algorithm's AP improved with the addition of data from artificial scenes, and further supplementation of the datasets is expected to enhance the algorithm's performance. It should be noted that the current implementation relies on offline perception updates and predefined processing steps, which may limit real-time adaptability in dynamic environments.

4.2. Dynamic Local Lifting Path Planning

To provide a more structured evaluation, the performance of the dynamic local lifting path planning algorithm is analyzed from three aspects: (1) collision avoidance capability, measured by the minimum safety distance; (2) adaptability to dynamic environmental changes, assessed by the trajectory adjustment behavior under disturbances; and (3) response to safety constraints such as torque limiter thresholds, assessed by the system's response to load constraints. The evaluation is conducted across both simulated and real-world scenarios to assess the consistency of system behavior under different operating conditions. The experiments are first validated thoroughly in the artificial scenes to ensure safety before being tested in the real world, which is also one of the advantages of the ILS. By fixing the hoist rope length $r$, the three-dimensional configuration space $(θ, σ, r)$ is simplified to a two-dimensional space $(θ, σ)$. Based on these parameters, a cost function considering time, energy, and torque constraints is constructed. The experimental parameters are determined according to practical crane operating conditions and the existing literature on crane motion planning and collision avoidance [31]. The parameter ranges cover typical lifting scenarios in real construction, making the setup representative and consistent with related studies in the field.

To validate the performance of the algorithm in handling the sudden intrusion of obstacles, we construct 3D construction site scenes. The global lifting path is calculated with the payload starting point at coordinates (45.98, -42.48, -98.49), the payload destination at coordinates (43.88, -57.12, -98.54), and the crane at coordinates (51.51, -50.72, -99.58).

Using the simulation system, a pedestrian movement animation is triggered at the start of hoisting, creating a scenario in which a pedestrian enters and remains on the lifting path before the payload arrives. All simulation experiments are repeated five times to check that the results are stable and reproducible. Across the repetitions, the experimental parameters, including the hoisting start point, destination point, crane position, and pedestrian model trajectory, remain fully consistent, so that differences between runs cannot be attributed to changes in the experimental conditions. The experimental results are shown in Figure 11 and Figure 12.

From Figure 11a and Figure 12a, it can be observed that the global path includes only a swing action. If the lifting task were executed along the global lifting path, there would be a high probability of collision between the payload and the pedestrian, which is not permissible. Figure 11b and Figure 12b show that, when pedestrian intrusion is detected, the dynamic local lifting planning algorithm regenerates the lifting path to avoid the pedestrian, using both swing and hoisting actions. Figure 13 shows that the minimum distance between the payload and the pedestrian is greater than 2 m after dynamic replanning. Under the tested conditions, these results demonstrate the system's ability to avoid collisions in dynamic intrusion scenarios by adjusting the lifting path in response to real-time perception inputs.
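The minimum-distance indicator used here can be computed from time-aligned position samples. The sketch below is a minimal illustration with made-up trajectories, not the experimental data; the function name is hypothetical, and the 2 m value is used only as the comparison threshold mentioned in the text.

```python
import math


def min_clearance(payload_path, obstacle_path):
    """Minimum Euclidean distance between time-aligned 3D position samples."""
    if len(payload_path) != len(obstacle_path):
        raise ValueError("trajectories must have the same number of samples")
    return min(math.dist(p, q) for p, q in zip(payload_path, obstacle_path))


# Made-up samples: the payload passes 5 m above a stationary pedestrian.
payload = [(0.0, 0.0, 5.0), (1.0, 0.0, 5.0), (2.0, 0.0, 5.0)]
pedestrian = [(2.0, 0.0, 0.0)] * 3
clearance = min_clearance(payload, pedestrian)
print(clearance, clearance > 2.0)  # 5.0 True
```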

After validating the algorithm's performance in artificial scenes, we implement the algorithm on a real crane for further validation. This setup is intended for controlled functional verification rather than full validation under real lifting load conditions. We set the crane boom length to 19.9 m and perform a lifting task with a 120-degree swing. During the swing, a four-wheeled platform carrying a humanoid model is moved onto the lifting path. The humanoid model is detected by the sensors mounted on the crane and the adopted BEVFusion algorithm. The experiment is repeated five times, with the starting point, destination point, humanoid model trajectory, and crane position kept unchanged.

The experimental site is shown in Figure 14. To better illustrate the algorithm results, we recreate the test scenario in the virtual world from the experimental data of the crane, as depicted in Figure 15a. The actual path of the payload avoids the humanoid model. To ensure safety in the deployed algorithm, we increase the threshold for the safety distance. Figure 15b shows the distance between the payload and the humanoid model. In the real-world experiments, the minimum distance remains above 4 m, so the collision is avoided as in the simulation. This agreement indicates that the proposed method maintains stable behavior across different environments under the tested conditions.

To validate the performance of the algorithm when the payload exceeds the capacity percentage threshold, we establish container lifting scenes. The global lifting path is calculated with the payload starting point at coordinates (42.51, -41.91, -98.49), the payload destination at coordinates (42.74, -59.60, -98.53), and the crane at coordinates (46.51, -50.72, -99.60).

During the lifting process, we simulate torque limiter signals and set them to exceed the capacity percentage threshold. The experimental results are shown in Figure 16 and Figure 17. From Figure 16a and Figure 17a, it can be observed that due to the presence of the green container, the global lifting path tends to lift the payload over the container. This ensures the completion of the lifting task with fewer switching maneuvers. Figure 16b and Figure 17b show that, once the payload exceeds the capacity percentage threshold, the dynamic local lifting planning algorithm increases the luffing angle to raise the safety threshold. This scenario verifies the system’s capability to respond to safety constraints by adjusting the lifting configuration when the load approaches the capacity threshold.

In practice, minor fluctuations in the measured working radius and load ratio may arise from structural vibration of the crane, measurement noise of the torque limiter, and positioning deviations during real-world operation. Despite measurement noise and structural vibration, the proposed method maintains stable decision-making behavior under the tested conditions, suggesting a certain degree of robustness to real-world uncertainty.

In the real crane validation, the initial work radius is 14.9 m, and the rated load capacity of the crane is listed in Table 4. A payload exceeding the capacity percentage threshold can easily lead to the overturning of the crane. For safety, the crane therefore operates with an empty hook under an assumed payload weight of 6 t, and during the lifting process the torque limiter output is artificially adjusted to exceed 10 t. This setup allows for controlled evaluation of the planning response but does not fully represent actual load conditions.

To show the algorithm's output more intuitively, the test scenario is recreated in a virtual environment based on real crane experimental data. Figure 18a shows the luffing motions triggered during the swing when the payload exceeds the preset capacity percentage threshold, whereas Figure 18b presents the corresponding load capacity curve of the crane during the hoisting process. For example, at a working radius of 14.9 m, the capacity percentage threshold is about 6 tons (i.e., 80% of 7.5 tons). When the load exceeds this limit, the system guides the crane to reduce the working radius to 12.3 m, raising the capacity percentage threshold to approximately 8.24 tons (i.e., 80% of 10.3 tons). This adjustment increases the available lifting capacity and reduces the risk of overloading during the lifting process. These experiments are conducted under controlled conditions and are intended as an initial validation of the proposed method rather than a comprehensive evaluation in real engineering environments. The results provide initial evidence that the proposed method can proactively modify the lifting configuration to maintain safety under load variations.
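The relation between working radius and capacity percentage threshold can be approximated from Table 4. The sketch below is a minimal illustration that assumes linear interpolation between the tabulated radii and stores the "3–7" column at 7 m; both are assumptions of this example, not the crane's actual load chart, and the function names are hypothetical. Under these assumptions the interpolated capacities are about 7.39 t at 14.9 m and 10.35 t at 12.3 m, close to but not identical to the 7.5 t and 10.3 t quoted in the text.

```python
from bisect import bisect_right

# Table 4: work radius (m) -> rated load capacity (t); the "3-7" column is stored at 7 m.
RADII = [7, 8, 9, 10, 11, 12, 13, 14, 16]
CAPACITY = [18.5, 18, 16.5, 14.5, 12.6, 10.8, 9.3, 8.2, 6.4]


def rated_capacity(radius_m: float) -> float:
    """Linearly interpolate Table 4 (an assumption; real load charts may differ)."""
    if not 3 <= radius_m <= 16:
        raise ValueError("radius outside the range of Table 4")
    if radius_m <= 7:
        return 18.5                          # flat segment for radii of 3 to 7 m
    i = bisect_right(RADII, radius_m) - 1    # index of the last tabulated radius <= radius_m
    if i == len(RADII) - 1:
        return CAPACITY[-1]                  # exactly at the largest tabulated radius
    r0, r1 = RADII[i], RADII[i + 1]
    c0, c1 = CAPACITY[i], CAPACITY[i + 1]
    return c0 + (radius_m - r0) / (r1 - r0) * (c1 - c0)


def capacity_threshold(radius_m: float, ratio: float = 0.8) -> float:
    """Capacity percentage threshold: 80% of the rated capacity by default."""
    return ratio * rated_capacity(radius_m)


for radius in (14.9, 12.3):
    print(radius, round(rated_capacity(radius), 2), round(capacity_threshold(radius), 2))
```

Reducing the radius from 14.9 m to 12.3 m raises the interpolated threshold by roughly 2.4 t, which matches the direction and approximate size of the adjustment described above.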

Compared with the global planning approach, the proposed dynamic local planning method demonstrates improved performance in terms of collision avoidance and safety response. The global path fails to handle dynamic obstacles and load constraints, while the proposed method consistently adapts to these changes under the tested conditions. Although the evaluation is conducted under controlled scenarios, the results provide consistent quantitative evidence of system performance across different conditions.

5. Conclusions

Intelligent lifting is an engineering challenge that involves multiple fields, such as crane system modeling, environmental perception, and task planning. This work presents an integrated framework for an ILS and divides the system into three types of digital agents: digital operators, digital conductors, and digital supervisors, which undertake execution, guidance, and global planning tasks, respectively. The main contribution of this study lies in integrating perception, planning, and digital twin technologies into a unified system, rather than in a single algorithm. We construct a virtual lifting environment based on digital twins and design a three-dimensional visualization interface to display the lifting process and system states. The system integrates a BEV-based environmental perception method with dynamic lifting path planning. From a practical perspective, the proposed system has potential applications in lifting scenarios where safety and adaptability are critical, such as construction sites with dynamic obstacles or limited visibility. The experimental results provide initial evidence of the feasibility of the proposed system in both simulated and controlled real-world environments, rather than a comprehensive assessment of practical performance. Further validation is needed to assess its performance in more complex real-world scenarios.

Although the method shows certain advantages in the experiments, limitations remain under actual conditions. The current perception module relies mainly on offline learning and evaluation and therefore cannot adapt to dynamic environments. The planning process requires multi-stage preprocessing, which limits real-time response in emergency scenarios. In addition, the current open-loop control strategy leaves room for improvement in trajectory tracking accuracy and system robustness. Future work can introduce learning-assisted planning methods and adaptive control strategies to enhance system adaptability and control performance in highly dynamic environments.

Author Contributions

Conceptualization, R.Z. and Y.C.; methodology, R.Z. and Y.M.; software, R.Z. and Y.M.; validation, Y.M.; formal analysis, R.Z.; investigation, R.Z. and Y.M.; resources, R.Z.; data curation, R.Z.; writing—original draft preparation, R.Z. and Y.M.; writing—review and editing, Y.M. and Y.C.; visualization, R.Z.; supervision, Y.C.; project administration, Y.C.; funding acquisition, Y.C. All authors have read and agreed to the published version of the manuscript.

Funding

Science and Technology Development Fund: 0029/2022/AGJ and 0029/2023/RIA1.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data generated in this study are not publicly available due to privacy constraints but may be obtained from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Kong, S.K.; Ku, Z.F. Causes and Solutions of Construction Crane Accidents in Malaysian Construction Industry. INTI J. 2022.
2. Sadeghi, S.; Soltanmohammadlou, N.; Rahnamayiezekavat, P. A systematic review of scholarly works addressing crane safety requirements. Saf. Sci. 2021, 133, 105002.
3. Lee, J.; Phillips, I.; Lynch, Z. Causes and prevention of mobile crane-related accidents in South Korea. Int. J. Occup. Saf. Ergon. 2022, 28, 469–478.
4. Xiao, Y.; Yang, T.Y.; Pan, X.; Xie, F.; Chen, Z. A reinforcement learning based construction material supply strategy using robotic crane and computer vision for building reconstruction after an earthquake. In Proceedings of the Canadian Conference on Earthquake Engineering, Vancouver, BC, Canada, 25–30 June 2023.
5. Golcarenarenji, G.; Martinez-Alpiste, I.; Wang, Q.; Alcaraz-Calero, J.M. Machine learning-based top-view safety monitoring of ground workforce on complex industrial sites. Neural Comput. Appl. 2022, 34, 4207–4220.
6. Zhang, M.; Ge, S. Vision and trajectory–based dynamic collision prewarning mechanism for tower cranes. J. Constr. Eng. Manag. 2022, 148, 04022057.
7. Gutierrez, R.; Magallon, M.; Hernandez, D.C. Vision-based system for 3D tower crane monitoring. Sensors 2021, 21, 11935–11945.
8. Stereolabs. Built for the Spatial AI. 2022. Available online: https://www.stereolabs.com/zed-2i/ (accessed on 9 January 2023).
9. Li, Z.; Wang, W.; Li, H.; Xie, E.; Sima, C.; Lu, T.; Qiao, Y.; Dai, J. BEVFormer: Learning bird’s-eye-view representation from multi-camera images via spatiotemporal transformers. In Proceedings of the European Conference on Computer Vision (ECCV); Springer: Cham, Switzerland, 2022; Volume 13669, pp. 1–18.
10. Huang, J.; Huang, G.; Zhu, Z.; Ye, Y.; Du, D. BEVDet: High-performance multi-camera 3D object detection in bird-eye-view. arXiv 2022, arXiv:2112.11790.
11. Huang, J.; Huang, G. BEVDet4D: Exploit temporal cues in multi-camera 3D object detection. arXiv 2022, arXiv:2203.17054.
12. Liu, Z.; Tang, H.; Amini, A.; Yang, X.; Mao, H.; Rus, D.L.; Han, S. BEVFusion: Multi-task multi-sensor fusion with unified bird’s-eye view representation. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA); IEEE: New York, NY, USA, 2023; pp. 2774–2781.
13. Tian, Y.; Li, X.; Wang, K.; Wang, F.Y. Training and testing object detectors with virtual images. IEEE/CAA J. Autom. Sin. 2018, 5, 539–546.
14. Kang, S.; Miranda, E. Planning and visualization for automated robotic crane erection processes in construction. Autom. Constr. 2006, 15, 398–414.
15. Han, S.H.; Hasan, S.; Bouferguène, A.; Al-Hussein, M.; Kosa, J. Utilization of 3D visualization of mobile crane operations for modular construction on-site assembly. J. Manag. Eng. 2015, 31, 04014080.
16. Kang, S.-C.; Chi, H.-L.; Miranda, E. Three-dimensional simulation and visualization of crane assisted construction erection processes. J. Comput. Civ. Eng. 2009, 23, 363–371.
17. Soltani, A.R.; Tawfik, H.; Goulermas, J.Y.; Fernando, T. Path planning in construction sites: Performance evaluation of the Dijkstra, A*, and GA search algorithms. Adv. Eng. Inform. 2002, 16, 291–303.
18. Wang, X.; Lin, Y.S.; Wu, D.; Zhang, C.W.; Wang, X.K. Path planning for crane lifting based on bi-directional RRT. Adv. Mater. Res. 2012, 446, 3820–3823.
19. Ali, M.A.D.; Babu, N.R.; Varghese, K. Collision free path planning of cooperative crane manipulators using genetic algorithm. J. Comput. Civ. Eng. 2005, 19, 182–193.
20. Hussein, M.; Zayed, T. Crane operations and planning in modular integrated construction: Mixed review of literature. Autom. Constr. 2021, 122, 103466.
21. Hu, X.; Chen, L.; Tang, B.; Cao, D.; He, H. Dynamic path planning for autonomous driving on various roads with avoidance of static and moving obstacles. Mech. Syst. Signal Process. 2018, 100, 482–500.
22. Zhu, A.; Zhang, Z.; Pan, W. Crane-lift path planning for high-rise modular integrated construction through metaheuristic optimization and virtual prototyping. Autom. Constr. 2022, 141, 104434.
23. Dutta, S.; Cai, Y.; Huang, L.; Zheng, J. Automatic re-planning of lifting paths for robotized tower cranes in dynamic BIM environments. Autom. Constr. 2020, 110, 102998.
24. Zhang, Z.; Pan, W. Lift planning and optimization in construction: A thirty-year review. Autom. Constr. 2020, 118, 103271.
25. Hu, S.; Fan, Y.; Bai, Y. Automation and optimization in crane lift planning: A critical review. Adv. Eng. Inform. 2021, 49, 101346.
26. Kayhani, N.; Taghaddos, H.; Mousaei, A.; Behzadipour, S.; Hermann, U. Heavy mobile crane lift path planning in congested modular industrial plants using a robotics approach. Autom. Constr. 2021, 122, 103508.
27. Hu, S.; Fang, Y.; Guo, H. A practicality and safety-oriented approach for path planning in crane lifts. Autom. Constr. 2021, 127, 103695.
28. Cai, P.; Chandrasekaran, I.; Zheng, J.; Cai, Y. Automatic path planning for dual-crane lifting in complex environments using a prioritized multiobjective PGA. IEEE Trans. Ind. Inf. 2018, 14, 829–845.
29. Sivakumar, P.Á.; Varghese, K.; Babu, N.R. Automated path planning of cooperative crane lifts using heuristic search. J. Comput. Civ. Eng. 2003, 17, 197–207.
30. Reddy, H.R.; Varghese, K. Automated path planning for mobile crane lifts. Comput.-Aided Civ. Infrastruct. Eng. 2002, 17, 439–448.
31. Zhang, Z.; Zhang, B.; Hu, W.; Zhou, R.; Cao, D.; Yin, H. Dynamic Three-Dimensional Lift Planning for Intelligent Boom Cranes. IEEE/ASME Trans. Mechatron. 2023, 28, 2885–2896.

Figure 1. Comparison between intelligent lifting system and traditional lifting system.

Figure 2. Concept framework of intelligent lifting system.

Figure 3. Architecture of the object detection and evaluation system.

Figure 4. Description of payload position by Cartesian coordinate and C-Space.

Figure 5. Possible collision among boom, obstacle, and payload. (a) boom, obstacle, and payload (b) the critical points of the collision boundary.

Figure 6. Sensor deployment scheme.

Figure 7. Perception dataset. (a) Sufficient light condition. (b) Insufficient light condition.

Figure 8. Object detection and localization results.

Figure 9. Models of 3D artificial scenes.

Figure 10. Data from the artificial scenes.

Figure 11. Lifting path planning in 3D construction site scenes: (a) global lifting path; (b) dynamic local lifting path.

Figure 12. Crane movements in 3D construction site scenes: (a) crane movements of global lifting path; (b) crane movements of dynamic local lifting path.

Figure 13. Distance between payload and pedestrian.

Figure 14. The sudden intrusion of obstacles at the experimental site.

Figure 15. Experiment results for sudden intrusion of obstacles: (a) dynamic local lifting path; (b) distance between payload and pedestrian.

Figure 16. Lifting path planning in 3D container lifting scenes: (a) global lifting path; (b) dynamic local lifting path.

Figure 17. Crane movements in 3D container lifting scenes: (a) crane movements of global lifting path; (b) crane movements of dynamic local lifting path.

Figure 18. Experiment results for payload exceeding the capacity percentage threshold: (a) dynamic local lifting path; (b) rated load capacity.

Table 1. Comparison of considered factors in lifting path planning.

Considered Factors		Traditional Lift Path Planning	Proposed Dynamic Lifting Path Planning
External factors	3D lifting map	√	√
	Load weight	√	√
	Load dimensions	√	√
	Newly introduced static obstacles	--	√
	Newly introduced dynamic obstacles	--	√
Internal factors	Boom length	√	√
	Lifting radius	√	√
	Real-time load capacity	--	√

Table 2. Pseudo code for the improved dynamic local lifting path planning.

Input: Start configuration $(θ_{s}, σ_{s}, r_{s})$; goal configuration $(θ_{g}, σ_{g}, r_{g})$; three-dimensional configuration space (C-space) grid map (if available); real-time environment and safety parameters.
Output: Executable lifting path or FAIL (manual takeover)

1. Digital supervisor: Input the three-dimensional start configuration $(θ_{s}, σ_{s}, r_{s})$ and the goal configuration $(θ_{g}, σ_{g}, r_{g})$.
2. Detect whether a grid map exists.
   • If no grid map exists, construct a discretized three-dimensional C-space grid map over the $θ - σ - r$ domain.
   • If a grid map exists, proceed to the next step.
3. Digital supervisor: Compute the global lifting path based on [31].
4. Digital operator: Execute the global lifting path.
5. Digital conductor: Monitor the internal and external conditions of the crane in real time.
   • If a new obstacle is detected, map the obstacle onto the 3D C-space grid map and proceed to Step 6.
   • If $p (k - 1) \geq 0.8 \bar{p} (k - 1)$ and $θ (k) < θ (k - 1)$, let $h' (k) = \infty$ and proceed to Step 8.
6. Digital conductor: Compute the local lifting path from the current configuration $(θ_{c}, σ_{c}, r_{c})$ to the goal configuration $(θ_{g}, σ_{g}, r_{g})$, and output the local path.
7. Digital operator: Execute the local lifting path.
8. Digital conductor: Terminate the computation if the hoisted load reaches the target location; otherwise, return to Step 4.
9. Digital operator: If planning fails, exit planning and hand over to manual operation.
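
The monitor-and-replan loop in Table 2 can be illustrated with a short Python sketch that plans on a discretized three-dimensional grid and replans from the current cell when a new obstacle is mapped onto it. This is a minimal sketch under stated assumptions, not the authors' implementation: plain A* with a Manhattan heuristic stands in for the global planner of [31], the load-capacity condition is omitted, and the names `plan`, `run_lift`, and the `events` feed (a stand-in for the perception output) are hypothetical.

```python
import heapq

# 6-connected moves in the discretized (theta, sigma, r) index space.
MOVES = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def plan(blocked, start, goal, shape):
    """A* over a 3D grid; returns a list of cells or None if no path exists."""
    def h(cell):
        # Manhattan distance, admissible for unit-cost 6-connected moves.
        return sum(abs(a - b) for a, b in zip(cell, goal))

    open_heap = [(h(start), 0, start)]
    came_from = {start: None}
    cost = {start: 0}
    while open_heap:
        _, g, cur = heapq.heappop(open_heap)
        if cur == goal:
            path = []
            while cur is not None:
                path.append(cur)
                cur = came_from[cur]
            return path[::-1]
        if g > cost[cur]:
            continue  # stale heap entry
        for move in MOVES:
            nxt = tuple(c + d for c, d in zip(cur, move))
            if any(n < 0 or n >= s for n, s in zip(nxt, shape)):
                continue  # outside the grid map
            if nxt in blocked:
                continue  # occupied cell
            ng = g + 1
            if ng < cost.get(nxt, float("inf")):
                cost[nxt] = ng
                came_from[nxt] = cur
                heapq.heappush(open_heap, (ng + h(nxt), ng, nxt))
    return None


def run_lift(start, goal, shape, blocked, events):
    """Execute a path cell by cell, replanning locally on new obstacles.

    events maps a step index k to a set of newly occupied cells
    (hypothetical stand-in for the perception feed).
    Returns the visited cells, or None to signal manual takeover.
    """
    blocked = set(blocked)
    path = plan(blocked, start, goal, shape)  # global path
    if path is None:
        return None
    visited, cur, k = [start], start, 0
    while cur != goal:
        new_obstacles = events.get(k, set())  # monitoring
        if new_obstacles:
            blocked |= new_obstacles
            if any(cell in blocked for cell in path[1:]):
                path = plan(blocked, cur, goal, shape)  # local path
                if path is None:
                    return None
        path = path[1:]  # advance one cell along the active path
        cur = path[0]
        visited.append(cur)
        k += 1
    return visited
```

For example, `run_lift((0, 0, 0), (4, 0, 0), (5, 5, 3), set(), {1: {(2, 0, 0)}})` starts along the straight global path, detects the new obstacle after the first move, and detours around the blocked cell. Replanning only when the remaining path is affected is a design choice of this sketch, not something the source specifies.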

Table 3. Performance of BEVFusion under different image and point cloud voxel sizes.

	Voxel Size
Image Size	0.075	0.1	0.125
128 × 352	77.4	76.9	76.5
256 × 704	79.0	77.7	77.5
384 × 1056	80.0	79.8	79.2

Table 4. Rated load capacity for a boom length of 19 m.

Work radius (m)	16	14	13	12	11	10	9	8	3–7
Rated load capacity (t)	6.4	8.2	9.3	10.8	12.6	14.5	16.5	18	18.5
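
To show how the rated capacities in Table 4 could feed a capacity-percentage check, the following Python sketch looks up the rated capacity for a given work radius and compares the load against 80% of it. It rests on assumptions that are not stated in the table: the symbols $p$ and $\bar{p}$ in the planning condition are read as the current load and the rated capacity, a radius between two tabulated values is rounded up to the next tabulated radius (the conservative choice), and the names `rated_capacity` and `exceeds_threshold` are hypothetical.

```python
from bisect import bisect_left

# (work radius in m, rated capacity in t) for a 19 m boom, from Table 4.
# The 3-7 m range is stored under its upper bound of 7 m.
RATED_19M = [(7, 18.5), (8, 18.0), (9, 16.5), (10, 14.5), (11, 12.6),
             (12, 10.8), (13, 9.3), (14, 8.2), (16, 6.4)]
MIN_RADIUS_M, MAX_RADIUS_M = 3, 16


def rated_capacity(radius_m):
    """Rated capacity in tonnes, rounding the radius up to a tabulated value."""
    if not MIN_RADIUS_M <= radius_m <= MAX_RADIUS_M:
        raise ValueError(f"work radius {radius_m} m is outside Table 4")
    radii = [r for r, _ in RATED_19M]
    # First tabulated radius >= radius_m (assumed conservative rounding).
    return RATED_19M[bisect_left(radii, radius_m)][1]


def exceeds_threshold(load_t, radius_m, threshold=0.8):
    """True if the load reaches the given fraction of the rated capacity."""
    return load_t >= threshold * rated_capacity(radius_m)
```

For instance, a 9 t load at a 12 m radius exceeds 80% of the 10.8 t rated capacity, whereas an 8 t load at the same radius does not.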

