1. Introduction
Compared with the manufacturing sector, construction is challenged by labor-intensive jobs, labor shortages, hazardous work environments, and occupational diseases. To address these problems, new technologies such as robotics and automation, artificial intelligence (AI), and information and communication technology (ICT) have been applied to construction projects over recent decades to improve efficiency, productivity, safety, and sustainability [1,2,3,4].
Following the successful demonstration of large-scale industrialized and robotized prefabrication of system houses in the Japanese market, Shimizu began researching and developing on-site construction robots in the 1970s [5], which can be regarded as the earliest attempt at construction robots. Today, driven by global research efforts, a growing number of Single-Task Construction Robots (STCRs) are being developed for easy deployment on construction sites to perform repetitive, dangerous, and physically demanding tasks [5,6,7,8,9,10]. Applications of intelligent machines and robots on construction sites are therefore promising.
However, STCRs still face many limitations, such as an inability to cooperate smoothly with workers or to complete tasks fully autonomously, so robots have thus far been designed mainly to assist human workers [11]. Generally speaking, robots have been designed and developed around human requirements, yet autonomous robots must become more intelligent to handle complex issues such as uncertainty and unpredictable situations [12]. The concept of human–robot collaboration (HRC) in construction presents a promising avenue for elevating the effectiveness and competitiveness of the industry by leveraging the unique strengths of both human workers and robots: HRC combines human experience with sophisticated robotic technical performance.
Research indicates that HRC has the potential to significantly enhance system performance and efficiency in construction projects [
13]. However, successful HRC requires careful consideration of human factors and their roles in the collaborative process due to their different characteristics (see
Table 1).
The use of robots in construction promises to improve productivity, quality, and safety in construction projects. Because the team is the basic working unit of a construction project, this research takes the team as its focus to better understand how AI and construction technologies can enhance construction project performance.
Assessing team performance in HRC settings presents several challenges. Traditional performance metrics often focus on single dimensions and do not account for the unique dynamics of HRC in the construction industry. The complexity of construction tasks, the need to integrate humans and robots, and the dynamic nature of construction sites make it difficult to evaluate team performance comprehensively.
This study aims to systematically develop a framework of indicators for HRC team performance in construction environments, together with an evaluation model based on the Analytic Hierarchy Process (AHP) and the Technique for Order Preference by Similarity to Ideal Solution (TOPSIS). The intent is to support decision-making on deploying HRC systems effectively in construction projects and to provide a reference for the design of construction robots. The proposed framework considers performance indicators systematically across dimensions such as productivity, safety, creativity, and satisfaction, which correspond to the goals of project and team management.
4. Conceptual Framework of HRC Team Performance
HRC in construction projects is becoming increasingly significant due to the need for enhanced productivity, safety, and quality [
90,
91]. These are not only widely recognized as key goals for HRC applications in construction projects [
92], but are also related to the practical needs and constraints of management of the built environment [
93]. In traditional construction projects, human–human teams accomplish tasks by pursuing goals such as progress, quality, safety, and cost [
94]. In HRC scenarios, the scope of performance also expands with the addition of intelligent machines, so these goals need to be interpreted more broadly. Therefore, HRC team performance in construction projects must be evaluated across multiple dimensions to capture the full scope of human–robot work.
In constructing the conceptual model, performance dimensions of productivity, safety, and quality were retained because each captures a distinct way that HRC reshapes construction work. Productivity matters because collaborative robots assume high-load, repetitive, or precision tasks while humans coordinate, supervise, and intervene at constraint points [
95]. Output depends on the fluency of this collaboration. Safety must be reconsidered once workers and robots share a dynamic workplace rather than being physically segregated: risk becomes time-varying, mediated by perception–response cycles, and highly sensitive to workers' understanding of robot motion intent [96]. Quality extends beyond the traditional understanding of construction quality, because robotic precision only translates into durable results when integrated with human preparation, finishing, and inspection under fluctuating workloads and trust conditions [97]. In addition, the introduction of intelligent and reprogrammable robots into dynamic site conditions exposes two further performance needs that traditional indicators do not capture well. First, construction work conditions change daily, so the value of HRC depends on how rapidly the team can reconfigure tasks, recover from disruptions, and sustain operational continuity when conditions deviate from plan; this emergent capability is framed here as flexibility paired with reliability under change [92]. Second, when robots assume repetitive, hazardous, or high-precision tasks, human workers regain cognitive and temporal capacity. Projects benefit when this released capacity is channeled into on-site problem solving, process improvement, and innovation, captured here as the dimension of team creativity [
98,
99]. Together, flexibility and team creativity distinguish HRC deployments that merely automate existing routines from those that unlock adaptive and innovative value in construction operations. These two HRC-specific dimensions therefore extend the traditional project performance triad and are essential to a complete conceptual account of HRC team performance.
Overall, the proposed conceptual framework defines five key performance dimensions: productivity, safety, quality, flexibility, and team creativity. To strengthen their theoretical grounding, these dimensions were derived from the integration of three theoretical foundations: Cognitive Load Theory (CLT) [
100], Technology Acceptance Model (TAM) [
101], and Team Role Theory (TRT) [
101].
The five performance dimensions in the framework (productivity, safety, quality, flexibility, and team creativity) are deductively justified from these three complementary theoretical lenses when HRC is examined as a socio-technical production system operating under dynamic site conditions. Mapping the theories onto an Input–Process–Output (IPO) logic clarifies why each dimension must be represented to capture HRC team performance. The integration provides conceptual coverage of (1) human information-processing capacity and error susceptibility (supported by the CLT), (2) perceived usefulness, usability, reliability, and behavioral intention to use robotic technology (supported by the TAM), and (3) role differentiation, creativity, and coordination dynamics in mixed teams (supported by the TRT). The CLT helps assess the mental workload of human workers during collaboration, which is critical for optimizing the human–robot interface and ensuring worker well-being. The TAM provides a lens for understanding how workers accept and use new technology, which helps in evaluating the robot's usability and the workers' willingness to collaborate, both key factors influencing team performance. The TRT allows humans and robots to be viewed as a collaborative team whose role allocation, coordination, and communication can be analyzed, moving beyond the traditional view of the robot as a mere tool.
By integrating these three theories, the framework ensures that performance is understood at the human level (cognitive and behavioral factors), the robot level (technological and task factors), and the team level (collaborative factors). In addition, to evaluate HRC team performance systematically, this paper organizes the framework using an IPO model across the three levels (human, robot, team), an approach previously used in performance research [102]. In this study, the IPO model conceptualizes how antecedent conditions (inputs) and interactive behaviors (processes) lead to outcomes (outputs). Inputs are the conditions present before collaboration, such as the human worker's operating skills and mental state (evaluated through the CLT to avoid overload), the robot's features and capabilities (evaluated based on the TAM), and the team composition (based on the TRT). Processes are the interactions and behaviors that occur as humans and robots work together, such as the human's decision-making and attention management under the CLT, the robot's task execution and adaptability under the TAM, and the team's coordination based on the TRT. Outputs are the resulting performance outcomes valued by the project, measured across multiple dimensions.
Figure 2 illustrates this three-level, three-stage, five-dimension framework, showing how inputs at the human, robot, and team levels feed into collaborative processes and yield performance outputs. The resulting five dimensions reflect the expanded performance goals for HRC teams, each justified by specific theoretical insights and the unique demands of human–robot collaboration in construction, as follows:
HRC productivity is maximized when technology is readily adopted and each team member (human or robot) focuses on the tasks they perform best, a synergy well-supported by the CLT and TAM. The CLT shows that reassigning high workload tasks to robots prevents human cognitive overload and preserves decision speed. The TAM indicates that actual productivity gains can be achieved only when workers perceive robots as useful and easy to deploy. Building on these theoretical expectations, this paper defines productivity to include not only task speed and the input–output ratio, which can be equated to schedule and cost, but also the degree of coordination between humans and robots, as well as reductions in physical and cognitive workload through shared task execution. Studies have shown that HRC can significantly enhance construction productivity by enabling robots to handle repetitive or hazardous tasks, allowing human workers to focus on more complex activities [
13,
103].
Compared with traditional construction projects, HRC has brought changes in technology, management, and cognition that are transforming the concept of safety: it is shifting from passive reaction to proactive response [
104], from spatial isolation to human–robot coexistence [
105], and from rule-based operation to cognitive collaboration [
106]. From the technology perspective, safety is no longer centered on physical isolation because collaborative robots are equipped with advanced perception, active learning, and decision-making capabilities; instead, it relies on the robots' safety awareness and intelligent responses to their surroundings. In terms of management, traditional construction sites relied on clearly defined boundaries between workers and machines, whereas in HRC scenarios humans and robots collaborate within shared spaces, so safety shifts from keeping distance to adaptive interaction and collaboration. In terms of cognition, traditional machines are tools with clear rules and predictable behavior that require only basic skills and knowledge from workers; the worker–machine relationship is linear and imposes a low cognitive workload, and safety relies mainly on physical protection and worker experience. With collaborative robots, safety depends on how well workers understand, predict, and cooperate with robot behavior, which increases their cognitive load [
107,
108]. The situational awareness and technical ability of workers become a critical part of safety [
109]. The concept of safety is thus no longer based on rule constraints, but on a cognitive, collaborative process between human workers and robots. Safety in HRC is likewise theory-coupled. The CLT suggests that elevated cognitive load and divided attention undermine situational awareness, motivating dynamic monitoring indicators such as real-time separation, hazard response, and human awareness measures. The TAM contributes by linking perceived reliability and trust in robot safety functions to worker risk-taking and intervention behavior. The TRT implies that stop authority must be distributed across team roles, so coordinated hazard response time and risk-assessment updating become key measures. In sum, the CLT, TAM, and TRT together frame HRC safety.
In traditional construction projects, quality focuses on technical accuracy and standard compliance. In HRC, the physical and psychological health of human workers also becomes an important component, reflected in the concept of job quality [
110,
111]. Robotic systems are expected to not only improve the precision of tasks but also reduce the physical stress and cognitive burden of workers [
46]. The CLT shows that cognitive overload, fatigue, and attention switching degrade inspection accuracy. The TAM reinforces this by highlighting that worker satisfaction and comfort are critical for technology adoption [
112]. The TRT demonstrates that complementary roles can raise first-pass yield and reduce rework in teams. The quality dimension thus encompasses indicators like physical and cognitive workload, and overall collaboration satisfaction, reflecting the goal of a healthier, less stressful work process for humans in HRC.
In addition, flexibility has become a key factor in HRC performance [
113]. This is because the construction environment is complex and variable, including constantly changing tasks, processes, and scenarios [
114,
115]. Unlike the relatively static work environments and processes of the manufacturing industry, HRC systems on construction projects must adapt to changing conditions in real time, making flexibility a key criterion for team performance. The CLT highlights the cognitive burden of changeovers and the need for schema transfer when tasks shift. The TAM is also relevant: for a robot to remain useful across varied scenarios, it must be perceived as flexible and easy to adjust. The TRT adds that a well-coordinated team can reassign subtasks when conditions shift, improving adaptive capacity. Therefore, this paper operationalizes flexibility through indicators that capture uptime, reconfiguration time, environmental adaptation, stability under stress, and robustness against disturbances.
The last important dimension of HRC team performance is team creativity, which emerges from the collaboration itself [
116,
117]. The CLT predicts that when repetitive and high-workload activities are automated, humans recover cognitive bandwidth for creative thinking and opportunistic problem solving. The TAM suggests that positive perceptions of robotic usefulness encourage experimentation with new robot-enabled workflows. The TRT indicates that heterogeneous teams spark creativity because each member brings unique capabilities: the robot's precision and data processing, combined with the human's experiential knowledge and intuition, can lead to creative synergies that neither could achieve alone. When robots take over repetitive, hazardous, or very precise tasks, such as continuous welding, bricklaying, or heavy lifting, human workers are freed from routine physical work and mental load [
118]. They can then focus on higher-level activities like planning the next work steps, adjusting site logistics when deliveries change, or exploiting the robot’s new abilities [
119]. In this context, creativity emerges as on-site innovations and real-time problem solving by the human–robot team. For example, workers can modify a robot’s path to avoid an unexpected obstacle. The TRT also emphasizes the importance of an open, communicative team climate. Thus, the value of HRC lies not only in higher efficiency but also in a rapid feedback cycle in which workers observe robot performance, propose improvements, and gradually reshape construction practice.
Considering these evolving expectations, this study proposes a framework comprising five dimensions (productivity, safety, quality, flexibility, and team creativity), which provides a comprehensive foundation for evaluating HRC team performance in the context of construction projects.
5. Identification of Indicators of Different Dimensions of HRC Team Performance
To operationalize the conceptual framework introduced in
Section 4, the present study establishes a set of measurable indicators that span five performance dimensions of HRC in construction projects.
The indicator set presented in this section was established through a three-stage sourcing process that integrates conceptual robustness with empirical and practical relevance. The first stage is theoretical construction. Starting from the Cognitive Load Theory, Technology Acceptance Model, and Team Role Theory introduced in
Section 4, we deduced the latent performance constructs that must be operationalized, such as productivity, proactive safety, adaptive capacity, and creativity. The second stage is literature synthesis. A structured scan of peer-reviewed articles identified indicators that recur across studies of construction robotics and human factors. This step ensured empirical recurrence and terminological consistency with the extant body of knowledge. The third stage is expert consultation. Discussions with a panel of seven domain experts (two HRC-related researchers, three construction project managers, and two robot engineers) were used to assess practical feasibility and site-level data availability.
Only indicators that passed all three gates were retained. The final catalog comprises 33 indicators, distributed across five performance dimensions (see
Table 2). The following subsections present each dimension in turn, preserving the original indicators while clarifying their roles.
5.1. Productivity Indicators
Productivity is the foremost dimension considered when assessing whether HRC brings value on site, yet a single speed-oriented variable rarely reveals the full picture. In the present framework, five complementary indicators were retained, namely task completion time, production capacity, the human–robot ratio, the human–robot time ratio, and collaboration efficiency.
Task completion time shows how long one task takes from the start to final acceptance, so it reflects schedule control. Production capacity records the physical output produced per unit time and adds a throughput view. The human–robot ratio describes how many workers and robots are assigned to the task, while the human–robot time ratio compares the actual labor hours each side puts in. Collaboration efficiency relates total output to the combined inputs of people and robots, giving an overall resource picture. These five indicators together make it possible to see whether faster progress comes from real process improvement or simply from adding more manpower or machines, and whether extra robots truly raise throughput in proportion.
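To make the resource-oriented indicators concrete, one possible operationalization is sketched below. This is offered purely as an illustration; the symbols are introduced here for clarity and are not defined in the original indicator list.

$E_{c} = \dfrac{Q_{\mathrm{out}}}{H_{\mathrm{human}} + H_{\mathrm{robot}}}, \qquad R_{\mathrm{HR}} = \dfrac{N_{\mathrm{human}}}{N_{\mathrm{robot}}}, \qquad T_{\mathrm{HR}} = \dfrac{H_{\mathrm{human}}}{H_{\mathrm{robot}}}$

where $Q_{\mathrm{out}}$ is the physical output produced in the observation window, $H_{\mathrm{human}}$ and $H_{\mathrm{robot}}$ are the labor hours contributed by workers and robots, and $N_{\mathrm{human}}$ and $N_{\mathrm{robot}}$ are the numbers of workers and robots assigned to the task. Under this reading, $E_{c}$ corresponds to collaboration efficiency, $R_{\mathrm{HR}}$ to the human–robot ratio, and $T_{\mathrm{HR}}$ to the human–robot time ratio.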
5.2. Safety Indicators
Safety is the basic precondition for any human–robot collaboration; without protective measures, productivity gains hold little practical meaning. In this study, nine indicators are retained, giving a complete and practical view of safety performance.
At the physical level, the first group of indicators considers the real-time distance that the robot maintains from workers and the peak forces recorded when contact occurs, ensuring conformance with the ISO/TS 15066:2016 [
152] limits. The second group of indicators captures responsive capability, that is, how quickly the robot moves to a safe state after a hazard is detected and how reliably it can recognize hazards on its own. Safety also depends on workflow design and operator awareness. Safe collaboration efficiency shows how much productive time is lost because the robot stops for safety. The risk-assessment update frequency tells whether formal reviews keep pace with changing site conditions. Situation awareness level, trust degree, and human error rate focus on the human side, showing how well workers understand the robot, how much they trust it, and how often misunderstandings lead to mistakes.
Taken together, these nine indicators connect hardware compliance, control responsiveness, procedure management, and human factors.
5.3. Task and Collaboration Quality Indicators
Quality indicates whether work delivered by the human–robot team meets the required specification and whether the collaboration feels trustworthy and comfortable. Approved co-executed tasks and overall effectiveness report the share of jobs that pass inspection on the first attempt, while rework incidents reveal hidden defects that require correction. Confidence in collaboration quality records the degree of reliability perceived by workers, and collaboration satisfaction offers a broader verdict on the cooperative experience. Physical workload, cognitive workload, and perceived stress level describe the muscular effort, mental effort, and psychological stress experienced by workers. Worker collaboration experience captures the intuitiveness of the interface, whereas intention recognition measures how often the robot correctly understands human commands or gestures at the first attempt. Together, these nine indicators present a rounded view of technical application, task quality, and worker well-being during HRC.
5.4. Flexibility and Reliability Indicators
Construction sites change quickly and robots must keep running in complex and dangerous environments. Flexibility and reliability are described through five indicators that trace system availability, adaptability, and endurance. Collaboration duration records how long the human–robot system stays operational within a given observation window. Collaborative task reconfiguration time measures how fast workers can break down a new task, generate fresh robot paths, and return to normal work when a plan or site layout is modified. Environmental adaptation ability shows the proportion of unexpected surroundings that the system can handle without pausing, indicating how well perception and control cope with real-world variability. Stability under extreme conditions looks at the loss of accuracy or throughput when the system faces harsh conditions. Finally, robustness reflects the share of output that stays within quality and safety limits when several disturbances occur at once, for example, a task change combined with sensor noise and network communication delay. Together these indicators offer a view of how the human–robot team can deliver work while adapting to construction sites.
5.5. Collaborative Creativity Indicators
Creativity is expected to emerge when robots take over routine or hazardous tasks and leave humans with time for exploration and problem solving. Six indicators are used to follow this process from the earliest psychological conditions to the final project outcomes.
Perceived collaboration creative climate and creative task willingness show whether the workplace encourages the sharing of ideas and whether workers are motivated to take part in higher-level problem solving after routine duties have been handed over to robots. Creative contribution measures the actual number of original and suitable solutions that arise under these conditions. Adopted innovation proposals denote workers' suggestions that are approved for use, while implemented robot-generated alternatives denote autonomous options produced by the robot and put into practice. Creative value-added links the adopted ideas to concrete benefits, such as cost savings, shorter schedules, or higher quality. Together, the six indicators trace a chain from climate and willingness through idea creation to measurable project value, providing a clear basis for judging the creative return brought by HRC in construction projects.
6. Validation, Weighting, and Evaluation of the Performance Framework
The initial set of 33 indicators, derived from theory and the literature, required empirical validation to ensure their relevance and importance in the context of construction projects. Furthermore, to enhance the practical utility of the framework, it is necessary to move beyond an assumption of equal importance for all indicators. This section details the three-step process of validating the indicator set, weighting it, and turning it into an evaluation model. First, an expert survey was used to screen the indicators for importance and reliability. Second, the Analytic Hierarchy Process (AHP) was employed to determine the relative weights of the retained indicators. Third, a comprehensive model based on the Technique for Order Preference by Similarity to Ideal Solution (TOPSIS) was established to evaluate the overall performance of an HRC team.
6.1. Validation of the Proposed Indicators
The validation of the indicators developed for assessing the performance of human–robot collaboration (HRC) teams in construction projects is a crucial step to ensure their reliability and applicability. This section explains how the 33 performance indicators for HRC teams were refined and then empirically validated.
To confirm the relevance and reliability of the proposed human–robot collaboration (HRC) performance indicators, the present study adopted questionnaire-based quantitative validation. The entire procedure was organized in four sequential steps, including expert selection, questionnaire design, data collection and analysis, and reliability and retention tests.
Firstly, fifteen specialists who possess experience in smart construction were invited to complete the survey. Their profile is shown in
Table 3. Their institutional distribution is as follows: seven experts came from universities and research institutes, four from construction contractors, two from construction-robotics technology firms, and two from a government department. This composition ensures balanced input from academic, industrial, and regulatory perspectives. The gender distribution corresponds to the actual demographics of the construction field, especially in roles such as site management, which remain male-dominated. Researchers were well represented (46.7%) because HRC is still at a developmental and research-oriented stage, and researchers possess the theoretical knowledge needed to establish the framework. Most participants had at least three years of professional practice in construction. This distribution reflects the current reality of professionals actively engaged in HRC research and practice, providing a range of experience levels that ensures both depth and currency in the survey feedback. HRC in construction is still an emerging area, and the majority of those working on it are relatively young professionals, such as early-career researchers and engineers, who tend to fall within the 3–5 year experience range.
Secondly, the questionnaire comprised 33 indicators covering five dimensions: Productivity (P), Safety (S), Collaboration Quality (Q), Functional Adaptability (F), and Creativity (C). Each item was evaluated on a five-point Likert scale (1 = “Not important at all”, 5 = “Extremely important”). In addition to the rating scale, a brief definition and an illustrative construction-site scenario were provided for every indicator to ensure common understanding among respondents.
Thirdly, electronic questionnaires were distributed online, and all fifteen experts returned valid responses, yielding a response rate of 100%. After data cleaning, three descriptive indices, namely the mean value, standard deviation (SD), and rank order (R), were calculated for each indicator using SPSS 26.0 (see Table 4). These indices quantify, respectively, the perceived importance, the consensus level, and the relative priority of every indicator.
Finally, Cronbach's alpha (α) was employed to examine internal consistency. The computed value was 0.916, which exceeds 0.9 and therefore indicates excellent reliability. This result also indicates that the respondents held a highly uniform understanding of the indicators and that the dataset is statistically sound for subsequent analysis. Following mainstream construction-management studies, an importance cut-off of 3.0 (the Likert mid-point) was adopted: indicators with a mean importance score above 3.0 were retained, while those at or below 3.0 were discarded.
Applying this rule produced 25 retained and 8 discarded indicators. Examples of the former include task completion time (P1) and dynamic safety distance (S1); examples of the latter include trust degree (S8) and human error rate (S9). The high alpha coefficient confirms that these retention decisions rest on a stable and reliable empirical foundation.
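As a hedged illustration of this screening step (not the authors' implementation, and using placeholder ratings rather than the study data), the following Python sketch computes Cronbach's alpha for an experts-by-indicators rating matrix and applies the 3.0 mean-importance cut-off:

```python
# Illustrative sketch: reliability check and importance screening of expert ratings.
# The ratings matrix generated below is a hypothetical placeholder, not the survey data.
import numpy as np

def cronbach_alpha(ratings: np.ndarray) -> float:
    """ratings: experts (rows) x indicators (columns), Likert 1-5."""
    k = ratings.shape[1]                          # number of indicators (items)
    item_vars = ratings.var(axis=0, ddof=1)       # variance of each item
    total_var = ratings.sum(axis=1).var(ddof=1)   # variance of each expert's total score
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

def screen_indicators(ratings: np.ndarray, names: list[str], cutoff: float = 3.0):
    """Retain indicators whose mean importance exceeds the Likert mid-point."""
    means = ratings.mean(axis=0)
    retained = [n for n, m in zip(names, means) if m > cutoff]
    discarded = [n for n, m in zip(names, means) if m <= cutoff]
    return retained, discarded

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    names = [f"I{i + 1}" for i in range(33)]         # 33 candidate indicators
    ratings = rng.integers(1, 6, size=(15, 33))      # 15 experts, placeholder Likert scores
    print("alpha =", round(cronbach_alpha(ratings), 3))
    retained, discarded = screen_indicators(ratings.astype(float), names)
    print("retained:", len(retained), "discarded:", len(discarded))
```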
The expert panel’s ratings reveal that the eight indicators listed in
Table 5 all fall at or below the 3.0 importance threshold and were therefore discarded. In brief, some items were judged redundant because their intent is captured by higher-scoring, easier-to-measure indicators (P4 versus P1/P2 and Q5 versus Q4). Others blend multiple constructs or lack an industry standard, making them difficult to define or operationalize with confidence (S5). Several describe antecedent conditions, that is, factors that influence performance rather than express it, and were excluded on that basis (S8 and Q8). Two items raise major data-collection or attribution challenges (S9 and Q7), and C5 concerns a practice that remains rare in current construction settings. Overall, the low means reflect a combination of conceptual redundancy, antecedent rather than outcome status, measurement difficulty, and the limited technological maturity of present-day construction HRC sites.
6.2. Determination of Indicator Weights Using AHP
While the initial validation confirmed the importance of the 25 indicators, it did not address their relative impact on the overall HRC team performance. Treating all indicators as equal could lead to misallocated resources and inaccurate performance assessments. To address this, AHP was employed to establish a scientific weighting scheme. AHP is a structured technique for organizing and analyzing complex decisions, based on decomposing a problem into a hierarchy and using comparison matrices to derive relative importance of decision elements [
153].
The same panel of 15 experts was engaged to conduct pairwise comparisons of the five dimensions and of the indicators within each dimension, using the 1–9 scale. The judgments were synthesized to calculate the local and global weights for each indicator. The consistency of the judgments was verified using the Consistency Ratio (CR), and all matrices achieved a CR below the 0.10 threshold, confirming the reliability of the expert inputs. The final calculated weights are presented in
Table 6.
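As a hedged illustration of the weighting step (not the authors' implementation), the sketch below derives priority weights from a pairwise comparison matrix using the principal-eigenvector method and checks consistency with Saaty's random index; the example matrix is hypothetical.

```python
# Illustrative AHP weighting sketch: eigenvector-based weights plus consistency ratio.
import numpy as np

# Saaty's random index values for matrices of order 1-9.
RI = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}

def ahp_weights(A: np.ndarray):
    """Return (weights, consistency_ratio) for a reciprocal pairwise comparison matrix A."""
    n = A.shape[0]
    eigvals, eigvecs = np.linalg.eig(A)
    k = np.argmax(eigvals.real)                  # principal eigenvalue index
    w = np.abs(eigvecs[:, k].real)
    w /= w.sum()                                 # normalized priority vector
    lam_max = eigvals[k].real
    ci = (lam_max - n) / (n - 1)                 # consistency index
    cr = ci / RI[n] if RI[n] > 0 else 0.0        # consistency ratio (accept if < 0.10)
    return w, cr

if __name__ == "__main__":
    # Hypothetical 5x5 dimension-level comparison matrix (P, S, Q, F, C).
    A = np.array([
        [1,   1/2, 2,   2,   3],
        [2,   1,   2,   3,   3],
        [1/2, 1/2, 1,   1,   2],
        [1/2, 1/3, 1,   1,   2],
        [1/3, 1/3, 1/2, 1/2, 1],
    ], dtype=float)
    w, cr = ahp_weights(A)
    print("weights:", np.round(w, 4), "CR:", round(cr, 3))
```

In a full application, global indicator weights would then be obtained by multiplying each dimension weight by the local weights of the indicators beneath it, consistent with the local and global weights reported in Table 6.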
The AHP results reveal a clear hierarchy of priorities. At the dimension level, Safety (WS = 0.2708) emerged as the most critical factor, surpassing Productivity (WP = 0.2327). This finding is significant, as it challenges the common assumption that robots are introduced into construction primarily for productivity gains. The expert panel’s prioritization suggests that the primary concern and potential barrier to HRC adoption is not cost or speed, but rather effective management of new and complex safety risks that arise when humans and robots share a dynamic workspace. This establishes safety performance as the fundamental prerequisite upon which all other performance benefits, including productivity, must be built. This has profound implications for the design of collaborative robots, which must prioritize robust hazard detection and human-centric safety protocols over pure operational speed.
At the indicator level, the weighting scheme provides a more detailed view of performance priorities. Task completion time (P1) remains the highest-ranked indicator (WP1 = 0.0763), reflecting the importance of schedule performance in construction. The rest of the top five comprises a mix of safety, productivity, and creativity indicators: dynamic separation distance (S1) (WS1 = 0.0667), the collaborative robot's autonomous hazard identification rate (S4) (WS4 = 0.0580), production capacity (P2) (WP2 = 0.0571), and creative value-added (C6) (WC6 = 0.0542). The high ranking of S1 and S4 emphasizes that both physical safety (maintaining distance) and system intelligence (autonomous hazard detection) are crucial. The significant weight given to C6 indicates a growing recognition that the ultimate value of HRC lies not just in performing existing tasks faster, but in enabling innovative solutions and creating value that was previously unattainable.
6.3. A TOPSIS-Based Evaluation Model for HRC Team Performance
TOPSIS is widely used for ranking a set of alternatives based on multiple criteria [
154]. The core principle of TOPSIS is that the optimal alternative should have the shortest geometric distance from the Positive Ideal Solution (PIS) and the longest geometric distance from the Negative Ideal Solution (NIS). The PIS represents the best possible performance score for each criterion, while the NIS represents the worst. This approach is particularly well-suited for evaluating complex systems like HRC teams, as it provides a single, comprehensive score that reflects overall performance across all dimensions. The methodology is adapted from its successful application in evaluating the resilience of complex engineering systems [
155,
156].
The TOPSIS evaluation process involves the following six steps.
Step 1: Construct the initial decision matrix ($X$). Assume that $n$ alternatives are evaluated against $m$ indicators, so an initial decision matrix $X$ is formed:

$X = \left[ x_{ij} \right]_{n \times m}, \quad i = 1, \ldots, n; \; j = 1, \ldots, m$

where $x_{ij}$ is the performance score of the $i$-th alternative on the $j$-th indicator.

Step 2: Normalize the decision matrix ($R$). To eliminate dimensional inconsistencies and allow for comparison, the matrix $X$ is normalized using the vector normalization method:

$r_{ij} = \dfrac{x_{ij}}{\sqrt{\sum_{i=1}^{n} x_{ij}^{2}}}$

where $r_{ij}$ is the normalized score.

Step 3: Construct the weighted normalized decision matrix ($V$). The normalized matrix $R$ is then multiplied by the indicator weights ($w_{j}$) derived from the AHP analysis in Section 6.2. This step integrates the relative importance of each indicator into the model:

$v_{ij} = w_{j} \, r_{ij}$

where $w_{j}$ is the weight of the $j$-th indicator, and $\sum_{j=1}^{m} w_{j} = 1$.

Step 4: Determine the Positive Ideal Solution (PIS, $A^{+}$) and Negative Ideal Solution (NIS, $A^{-}$). The PIS and NIS are identified from the weighted normalized matrix $V$:

$A^{+} = \{ v_{1}^{+}, v_{2}^{+}, \ldots, v_{m}^{+} \} = \{ \max_{i} v_{ij} \mid j = 1, 2, \ldots, m \}$

$A^{-} = \{ v_{1}^{-}, v_{2}^{-}, \ldots, v_{m}^{-} \} = \{ \min_{i} v_{ij} \mid j = 1, 2, \ldots, m \}$

where $v_{j}^{+}$ is the PIS of the $j$-th indicator and $v_{j}^{-}$ is the NIS of the $j$-th indicator.

Step 5: Calculate the separation measures. The Euclidean distance of each alternative from the PIS ($D_{i}^{+}$) and the NIS ($D_{i}^{-}$) is calculated:

$D_{i}^{+} = \sqrt{\sum_{j=1}^{m} \left( v_{ij} - v_{j}^{+} \right)^{2}}, \qquad D_{i}^{-} = \sqrt{\sum_{j=1}^{m} \left( v_{ij} - v_{j}^{-} \right)^{2}}$

Step 6: Calculate the relative closeness to the ideal solution ($C_{i}$). The final performance score for each alternative is calculated as its relative closeness to the ideal solution:

$C_{i} = \dfrac{D_{i}^{-}}{D_{i}^{+} + D_{i}^{-}}$

The value of $C_{i}$ ranges from 0 to 1. A higher value indicates that the HRC team's overall performance is closer to the ideal and therefore better.
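To make the six steps concrete, the following Python sketch (an illustrative implementation, not the authors' code, with placeholder weights and graded scores) computes the relative closeness for a set of alternatives:

```python
# Illustrative TOPSIS sketch following the six steps above.
# `decision` holds graded scores (1-4) for each alternative; `weights` would come
# from the AHP step. Both arrays below are hypothetical placeholders.
import numpy as np

def topsis_closeness(decision: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Return the relative closeness C_i of each alternative (row) to the ideal solution."""
    # Step 2: vector normalization of the decision matrix.
    norm = decision / np.sqrt((decision ** 2).sum(axis=0))
    # Step 3: weighted normalized matrix.
    weighted = norm * weights
    # Step 4: positive and negative ideal solutions (all indicators treated as benefit criteria).
    pis = weighted.max(axis=0)
    nis = weighted.min(axis=0)
    # Step 5: Euclidean separation from PIS and NIS.
    d_plus = np.sqrt(((weighted - pis) ** 2).sum(axis=1))
    d_minus = np.sqrt(((weighted - nis) ** 2).sum(axis=1))
    # Step 6: relative closeness to the ideal solution.
    return d_minus / (d_plus + d_minus)

if __name__ == "__main__":
    weights = np.full(25, 1 / 25)                               # placeholder equal weights
    rng = np.random.default_rng(1)
    decision = rng.integers(1, 5, size=(4, 25)).astype(float)   # 4 alternatives, grades 1-4
    print(np.round(topsis_closeness(decision, weights), 3))
```

In practice, the graded scores from Table 9 and the AHP weights from Table 6 would replace the placeholders, and the resulting closeness value would be compared against the thresholds in Table 8.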
To operationalize the TOPSIS model, raw data must be translated into a consistent numerical scale. This study establishes a four-level grading system for each of the 25 HRC performance indicators: Grade I (Poor), Grade II (Fair), Grade III (Good), and Grade IV (Excellent). For calculation, these grades are mapped to a 1–4 numerical scale. This rubric provides a clear and objective standard for data collection, making the evaluation process transparent and repeatable.
Table 7 details the specific criteria for each grade level.
Based on the evaluation standards of HRC team performance indicators, an initial evaluation matrix was established. The TOPSIS evaluation method was then applied to calculate the relative closeness scores for four levels of performance indicators, which were used to define the HRC team performance evaluation criteria. The detailed calculation results are shown in
Table 8.
According to the calculation results in
Table 8, the HRC team performance evaluation criteria are as follows:
- (1) Low Performance: ;
- (2) Fair Performance: ;
- (3) Good Performance: ;
- (4) Excellent Performance: .
6.4. Case Study
6.4.1. Case Background
To validate the practical applicability of the AHP-TOPSIS framework, a case study was conducted on a realistic construction scenario. The case involves a human–robot team tasked with interior wall finishing for a new high-rise residential building project in Nanjing, China.
The HRC team consists of one skilled worker and a plastering robot. The worker's responsibilities are not eliminated but are shifted to higher-value activities. These include (1) setting up the work area and the robotic system, (2) preparing and loading material into the robot's hopper, (3) performing real-time quality control by visually inspecting the robot's application, and (4) manually finishing complex geometries such as corners, edges, and areas around electrical outlets that are difficult for the robot to access. This arrangement leverages the human's experience, adaptability, and flexibility. The robot is a fully autonomous wall painting system equipped with an arm, a material sprayer, and a sanding tool head. Its primary tasks are to apply plaster evenly across large wall surfaces and then grind them to a smooth finish, thereby performing the most physically demanding and repetitive parts of the construction task.
The workflow is designed as a collaborative cycle. The first step is the setup. The worker prepares a room, sets up the mobile robot platform, and fills the material hopper. The second step is autonomous operation. The robot is activated and performs a scan of the room, plans its path, and begins applying plaster to the main wall sections. The third step is worker activity. While the robot works, the worker prepares the next batch of plaster, monitors the robot’s progress, and begins preliminary work on detailed areas. The final step is finishing and transition. Once the robot completes the main surfaces of a room, the worker steps in to manually plaster and sand the corners and edges, ensuring a high-quality finish. During this time, the robot can be moved to the next room to begin its scanning and setup process, minimizing downtime.
6.4.2. HRC Team Performance Data Collection and Score Calculation
Data for the case study were collected through a combination of project documentation review, operator interviews, and direct observation. These data were then scored according to the grading standards established in
Table 7 to generate a numerical score for the 25 indicators. This process forms the initial decision matrix for the TOPSIS analysis. The raw data and corresponding graded scores are presented in
Table 9.
With the initial decision matrix established, the six-step TOPSIS calculation was performed. The weights from
Table 6 were applied to create the weighted normalized matrix. Finally, the Euclidean distances and the relative closeness score (C) were calculated. The resulting score for the case study was C = 0.664, which is below 0.723; according to the evaluation criteria established in Section 6.3, the overall HRC team performance is therefore classified as Fair Performance.
6.4.3. Results Analysis and Suggestions
This result indicates that while the HRC team is effective and provides significant benefits over traditional human–human collaboration, there are specific areas with room for improvement to achieve a Good or Excellent result.
The HRC team performs well in the dimensions of productivity and safety. High scores in P3 (human–robot ratio) and S1 (dynamic separation distance), combined with their significant AHP weights, contributed positively to the overall score. This shows that the system is well designed from a core efficiency and physical safety perspective: the robot effectively handles repetitive work, and the human–robot ratio is optimized for the construction task. However, the primary constraints on team performance lie in the dimensions of flexibility and reliability and of collaborative creativity. The indicators F2 (collaborative task reconfiguration time) and F3 (environmental adaptation ability) both received a low score of 2, and because the flexibility and reliability dimension carries substantial weight, these low scores had a marked negative impact on the final result. Similarly, the indicators belonging to the collaborative creativity dimension scored low. The focus on maximizing productivity appears to leave little incentive for workers to engage in creative problem solving (C1, C2, C3, C4). Consequently, the team generated minimal creative value-added (C6), overlooking a key potential benefit of HRC, in which humans freed from repetitive tasks can focus on higher-level cognitive work.
Based on this result analysis, the project manager can translate the quantitative findings into improvement strategies. The low score on F2 shows that the task transition is a major drag on HRC team performance. The manager should determine whether the delay stems from insufficient worker training, cumbersome robot setup or teardown steps, or poor site logistics. Possible responses include focused rapid deployment training for the worker, evaluating whether to apply alternative robotic systems that enable more automated mobility, and restructuring the workflow. The low score on F3 indicates that the robot struggles when conditions deviate beyond minor variations, forcing the worker to compensate and thus reducing overall autonomy. To address this, the manager should collaborate with the technology provider to clarify current sensing and decision-making limits and explore software updates or configuration changes that could improve AI-driven problem solving. Indicators of collaborative creativity are also weak, suggesting that the HRC team is perceived mainly as a production resource rather than a source of continuous improvement. Establishing a formal feedback mechanism, such as a brief weekly meeting for the worker to share observations and improvement ideas, can bring actionable insights. Pairing this with small performance-based incentives for adopted ideas that measurably improve efficiency, safety, or quality can improve indicators of collaborative creativity.
Overall, this case demonstrates that the AHP–TOPSIS framework is more than an academic evaluation tool. It can generate a quantitative performance rating to help analyze specific strengths and weaknesses of HRC team performance and enable managers to make decisions to optimize HRC team performance.
8. Conclusions
This paper proposes a new, theory-oriented framework of performance indicators for HRC teams in the construction industry, thereby shifting the evaluation perspective from individual actors to the collaborative team. Grounded in Cognitive Load Theory, the Technology Acceptance Model, and Team Role Theory, the framework is deliberately designed to capture cognitive, technological, and collaborative aspects of team performance that are neglected by conventional assessment approaches. After questionnaire-based empirical validation, the final framework contains five performance dimensions with twenty-five indicators. These dimensions span productivity, safety, quality, flexibility, and creativity, and together they depict HRC team performance as a holistic property rather than a simple sum of individual outputs. In this way, the research fills a gap in construction HRC studies, where previous investigations have tended to examine human and robot performance in isolation. In addition, the AHP method is used to establish a scientific weighting scheme, revealing a prioritization structure in which the Safety dimension outranks Productivity and challenging the common perception that robots are introduced primarily to accelerate work. To translate the indicator system into an operational assessment tool, the study couples the AHP weights with graded evaluation criteria and implements a TOPSIS-based evaluation model, deriving relative closeness thresholds that distinguish between Low, Fair, Good, and Excellent HRC team performance levels. An interior wall plastering case involving a plastering robot and one skilled worker is then used to demonstrate the framework's practical utility. With the application of the AHP-TOPSIS model, the team's overall performance (C = 0.664) falls in the Fair range. The case analysis shows relatively strong performance in Productivity and Safety but reveals weaknesses in Flexibility and Reliability and in Collaborative Creativity, underscoring the managerial risk of treating HRC only as a production accelerator rather than as an approach to adaptability and innovation.
From a practical standpoint, the proposed framework supplies project managers and site engineers with a scientific and easy-to-operate instrument for enhancing human–robot teaming on construction sites. By applying the indicator set, practitioners can systematically diagnose HRC deployments and identify weaknesses that demand improvement. This team-centric evaluation mechanism ensures that critical factors are explicitly considered, reducing the risk that neglecting such factors will undermine overall HRC team performance. At the same time, effective HRC allows human workers to concentrate on complex and creative problem-solving tasks, while robots take on physically demanding or highly repetitive operations. Consequently, the framework functions not only as a benchmark for measuring success but also as a practical guideline for integrating robots into construction teams.
Looking ahead, several research directions are recommended to deepen and extend the present work. First, the framework could be integrated with real-time data streams from IoT sensors and BIM to create a dynamic, continuous performance monitoring dashboard for project managers, using sensor-driven safety distances, log-based uptime, and digital reporting of rework to reduce subjectivity and enable near-real-time feedback. Second, studies are needed to track HRC team performance over the entire project lifecycle, using the framework to measure performance degradation and long-term sustainability. Third, as the validation data for this framework come from China, cross-regional applications are needed to examine how cultural norms, workforce practices, and regulatory requirements influence observed indicator scores; multi-country pilots could establish local baselines, calibrate thresholds, and test the robustness of perception-based indicators across differing safety cultures. Finally, future research could employ methods such as structural equation modeling to explore the causal relationships between the five performance dimensions, testing hypotheses such as whether improved safety directly leads to higher perceived quality and creative willingness.