Towards Human–Robot Collaboration in Construction: Understanding Brickwork Production Rate Factors

Abstract: This study explores the critical determinants of labor productivity in brickwork operations within the construction industry, a matter of academic and practical significance, particularly in an era of increasing human–robot collaboration. Through an extensive literature review on construction labor productivity, this study identifies factors affecting brickwork productivity. Data were collected from active construction sites during brick wall construction through on-site measurements and participatory observation, and the relative importance of these factors was determined using Principal Component Analysis (PCA)-based factor analysis. The suitability of the data for the analysis was established through the Kaiser–Meyer–Olkin (KMO) test and Bartlett's test of sphericity, with a KMO value of 0.544 and significance at the 0.05 level. The analysis reveals four principal components explaining 75.96% of the total variance. Notably, this study identifies the Euclidean distances for the top factors: weather (0.980), number of helpers (0.965), mason competency (0.934), and number of masons (0.772). In the correlation analysis, wall area had the highest coefficient (0.998), followed by wall length (0.853) and height (0.776). Interestingly, high correlations did not necessarily translate to high factor importance. The identified factors can serve as a foundation for predictive modeling algorithms for estimating production rates and as a guideline for optimizing labor in construction planning and scheduling, particularly in the context of human–robot collaboration.


Introduction
The construction sector plays a vital role in the economic progress of many countries worldwide, making it essential for socioeconomic growth [1]. Construction materials, including cement, bricks, and sand, constitute a significant portion of project costs, with the materials sector being a top contributor to the Gross Domestic Product (GDP) of the construction industry [2]. In Uganda, the construction sector's contribution to the country's GDP has reached over 13%, with an annual growth rate of 5% [3]. With a growing population and an annual housing demand of 200,000 units, the construction industry faces the challenge of meeting the high demand for housing [3]. The construction of buildings sub-sector has a sales share of 37%, the largest in the industry sector of Uganda. Small and medium enterprises (SMEs) have been tasked with significantly reducing unemployment in Uganda and have been earmarked as critical players in providing decent employment by facilitating household engagement in income-generating activities [4]. The National Labor Force Survey (NLFS) 2021 highlights that construction SMEs in Uganda employ about 4.7% [5]. However, given the informal nature of the construction industry, this number is likely to understate the true situation.
Labor productivity at the national level witnessed a weak upsurge prior to COVID-19 but was projected to diminish continuously into 2020 and the years ahead [6]. This projected decline could adversely affect the construction sector, which already faces productivity challenges evidenced by project time and cost overruns [1]. Katende et al. [7] identified industrialization as a way to promote productivity in the construction industry by employing modern construction methods that are not labor-intensive and yield a high return in productivity output. However, construction firms in Uganda invest little in new technologies, mostly because the cost of these technologies remains high. The construction industry, unlike the manufacturing industry, has seen little such technological progress and has stagnated significantly; this has often resulted in inappropriate working conditions and sub-par human conditions associated with technological inadequacies. As such, a combination of human capital with capital-intensive, non-linear technological advances promises autonomy, flexibility, and cooperation among robot systems to develop complex, industrialized products for higher productivity, improved occupational safety, and much-needed innovation in the construction industry [8]. Malakhov and Shutin [9] highlight that the use of masonry robots reduced building costs 3-7 times, depending on various project-specific technical and economic factors, shortened construction times, and delivered stable product quality independent of worker competency. However, as pointed out earlier, Ugandan construction has been slow to adopt automated construction techniques; as such, this study focused on the human aspects of human-robot collaboration, seeking to identify the significant factors influencing brick masonry building construction.
Fired clay bricks are a commonly used masonry material in Uganda, particularly for residential houses. However, the use of these bricks often leads to productivity losses due to factors such as inconsistent brick sizes, poor craftsmanship, and rushed construction schedules [4]. The brick-laying industry in Uganda lacks established formal institutions that offer structured brick-making training. Instead, many brick-makers acquire their skills through hands-on experience and use traditional wooden molds that lack precise calibration, hence the inconsistent brick sizes in the market. Furthermore, the in-field kilns used to fire the bricks lack temperature control [5]. These issues result in excessive amounts of mortar and plaster being required for bonding and for achieving an even finish on the walls. Productivity losses are a significant concern in the construction industry as they can increase labor costs and reduce project profitability [6]. The measurement and estimation of production rates in brickwork masonry pose challenges due to the unique characteristics of construction projects [1]. Although benchmarking is commonly used in Uganda's construction industry to assess labor productivity and production rates, it lacks documentation on the effects of weather, craftsman experience, and crew size on construction productivity. Consequently, the specific effects of these factors on production rates, as well as their relative significance, remain unclear.
In light of these challenges, this research applies factor analysis to the variables influencing the production rate of brick wall construction. Additionally, it aims to rank these factors by their relative importance. This research is crucial for construction planning, optimizing labor productivity, predicting labor costs and production rates, and effectively scheduling work packages. This study proposes a PCA-based factor analysis, with the KMO test and Bartlett's test of sphericity used to assess the sampling adequacy of the dataset for factor analysis. The Euclidean distance, a measure of dissimilarity, is used to extract significant factors based on a threshold of 0.7.

Human-Robot Collaboration
Construction automation is described as the improvement of production and minimization of accident risks by employing robots to perform tasks previously assigned to humans. Its objective is to shorten time periods and achieve better quality using fewer resources [7]. Automation has been identified as a solution to increased operations intensity, resulting in effective applications. However, the construction industry has been criticized for its slow pace in adopting automated systems, mostly attributed to the difficulty of designing universal and profit-oriented automated systems [8]. García de Soto et al. [9] further attribute this slow adoption to the industry's mostly traditional nature and resistance to change, low industrialization of construction activities, poor collaboration and data interoperability, and high turnover that hinders the implementation of change. The adoption of robots in brick masonry ranges from bricklaying to masonry construction. Several studies have investigated the use of robots in these areas, with major studies on bricklaying and brickwork construction automation. Harinarain et al. [7] investigated the use of bricklaying robots to build and assemble brick walls with minimal human interaction; however, as of 2021, no construction company in South Africa had implemented them. Their study also provides insights into the advantages, disadvantages, and willingness to adopt automated systems for construction tasks. Malakhov and Shutin [8] conclude that the use of masonry robots in construction projects, though complex, will optimize the construction process in terms of technical, social, and organizational parameters. Mitterberger et al. [10] studied the application of augmented reality in brickwork building construction and developed a custom-built augmented reality system for in situ construction.

Construction Productivity
Production rates and productivity play vital roles in the effective management and estimation of construction projects. Efficient management of production rates leads to improved project efficiency, reduced delays, and successful project outcomes [11]. Productivity, on the other hand, focuses on optimizing resource utilization in production activities [12]. It is measured as the ratio of output to input in the production process. Labor productivity, particularly in the labor-intensive construction industry, holds significant importance [13].

Production Rate
Production rate is the mean velocity at which construction activities proceed, and it holds great significance in construction scheduling and management [11]. Herbsman and Ellis [14] identified accurately estimated production rates as an essential component of estimating contract durations. Accurate and precise estimation of production rates improves management efficiency, which in turn cuts delays and enables projects to be completed on time and within the agreed budget. Production rates are determined through site visits, review of project records, and the use of estimating manuals [15].

$$\text{Production Rate} = \frac{\text{Total estimated quantity of the activity}}{\text{Activity duration}} \tag{1}$$
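As a brief illustration of Equation (1), the sketch below computes a production rate and then inverts the relation to estimate the duration of a planned quantity of work. The function name and all figures are hypothetical and are not drawn from the study's dataset.

```python
def production_rate(total_quantity: float, duration: float) -> float:
    """Equation (1): quantity of work completed per unit of activity duration."""
    if duration <= 0:
        raise ValueError("duration must be positive")
    return total_quantity / duration


# Hypothetical figures: 24 m^2 of brick wall completed over 3 working days.
rate = production_rate(24.0, 3.0)  # 8.0 m^2 per day

# Rearranging Equation (1) gives a duration estimate for a planned quantity,
# assuming the same rate holds for the new work.
planned_days = 60.0 / rate  # 7.5 days for 60 m^2

print(rate, planned_days)
```

The rearranged form is how a production rate typically enters scheduling: a planned quantity divided by the expected rate yields the activity duration.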

Construction Productivity
Productivity concerns the effective and efficient utilization of the resources needed for a production activity [12]. It is usually defined as the ratio of the output of production to the input of production factors or means and is a major determinant of the success of construction projects. Because the construction industry is labor-intensive, most researchers have focused on improving construction labor productivity [13]. Dolage and Chan [12] identified total factor productivity, labor productivity, and construction productivity as the productivity measures most commonly used by researchers. Calcagnini and Travaglini [16] defined labor productivity as the actual output, such as courses laid per labor hour worked; it is used to estimate the work output produced within a labor unit. Labor productivity is the quotient of an output quantity and labor, as expressed in Equation (2).

$$\text{Labor productivity} = \frac{\text{Output quantity}}{\text{Labor hours}} \tag{2}$$

Syverson [17] defines total factor productivity (TFP) as the ratio of total output to total input. The total input includes labor, materials, energy, equipment, and capital. The expression of total factor productivity is shown in Equation (3).

$$\text{TFP} = \frac{\text{Total output}}{\text{Labor} + \text{Materials} + \text{Equipment} + \text{Energy} + \text{Capital}} \tag{3}$$

Researchers employ various methods to measure productivity, including total factor productivity, labor productivity, and construction productivity [12,16]. Labor productivity, shown in Equation (2), assesses the output achieved per labor hour, providing insights into the efficiency of labor utilization [16]. Total factor productivity, shown in Equation (3), considers multiple factors, such as labor, materials, energy, equipment, and capital, to evaluate the overall efficiency of the construction process [17]. Evaluating both labor productivity and total factor productivity allows researchers and practitioners to gain a comprehensive understanding of the effectiveness and efficiency of the construction process. To determine production rates accurately, site visits, project record revisions, and the use of estimating manuals are commonly employed [15]. These practices enable precise estimation of production rates, providing crucial information for construction scheduling, management, and resource allocation.
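Equations (2) and (3) can be illustrated with a short worked example. The sketch below is a minimal one with hypothetical function names and invented figures; it assumes that all inputs to total factor productivity are expressed in the same monetary unit so that they can be summed.

```python
def labor_productivity(output_quantity: float, labor_hours: float) -> float:
    """Equation (2): output quantity per labor hour."""
    return output_quantity / labor_hours


def total_factor_productivity(total_output: float, labor: float, materials: float,
                              equipment: float, energy: float, capital: float) -> float:
    """Equation (3): total output divided by the sum of all inputs.

    Assumes output and inputs share one monetary unit.
    """
    total_input = labor + materials + equipment + energy + capital
    return total_output / total_input


# Hypothetical crew: 12 m^2 of wall laid using 16 labor hours.
lp = labor_productivity(12.0, 16.0)  # 0.75 m^2 per labor hour

# Hypothetical cost figures in one currency unit.
tfp = total_factor_productivity(1500.0, labor=400.0, materials=500.0,
                                equipment=50.0, energy=30.0, capital=20.0)  # 1.5

print(lp, tfp)
```

Labor productivity isolates a single input, whereas total factor productivity falls whenever any input grows without a matching increase in output, which is why the two measures can move in different directions on the same project.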

Factors Affecting Construction Productivity
Extensive research in the field of construction productivity has shed light on various factors that influence productivity, with a particular focus on identifying and quantifying the factors contributing to productivity loss. Lawaju et al. [18] grouped these factors into categories. This categorization provides a comprehensive framework for understanding the diverse range of factors that can impact productivity in construction projects. Another study by Moselhi and Khan [19] further examined the factors influencing productivity and categorized them into three distinct groups. The first category explored weather-related factors, considering the effects of temperature, precipitation, humidity, and wind speed on productivity. The second category focused on job-related factors, such as floor height, work type, and work method, which can significantly impact the efficiency of construction activities. Lastly, the study investigated labor-related factors, including gang size, labor availability, and daily quantity, as key determinants of productivity in construction projects. By identifying and analyzing these factors, researchers and practitioners gain valuable insights into the complex dynamics that influence productivity and production rates in construction. Understanding the interplay between these factors allows for the development of targeted strategies and interventions to enhance productivity and mitigate potential disruptions.

Factor Analysis
Factor analysis is a powerful multivariate statistical technique used to identify logical subsets of variables that are relatively independent of one another within a single dataset. The method assumes that all variables in the set exhibit some level of correlation and requires the variables to be measured at least at the ordinal level. There are two primary approaches: exploratory factor analysis (EFA) and confirmatory factor analysis (CFA). EFA is employed to examine dimensionality and gather information about the interrelationships among variables during the initial stages of research. CFA, on the other hand, is a more complex and sophisticated set of techniques used to test specific hypotheses or theories concerning the underlying structure of a variable set [20]. EFA works similarly to the commonly used Principal Component Analysis (PCA); however, the two are distinct techniques with different focuses and should not be confused. EFA emphasizes the connections between variables, using covariance to identify factors, while PCA uses variance to identify components. PCA simplifies the dataset by generating principal components and reducing the number of variables. In contrast, EFA reveals underlying constructs and identifies latent factors to explain the data. In summary, EFA explores interrelationships between variables and uncovers latent factors, whereas PCA reduces dimensionality by creating principal components based on variance [21].

Factor analysis has been implemented in various construction productivity studies, most commonly with PCA as the chosen technique. Hiyassat et al. [22], in a case study on construction productivity in Jordan, implemented PCA-based factor analysis with a KMO sampling adequacy value of 0.529. PCA-based factor analysis was further used in the labor productivity studies of [13,23]. However, PCA is a variation of EFA and should not be mistaken for EFA. The literature notes potential negative implications of this approach, yet it continues to be adopted in methodological approaches [24]. The continued use of PCA over EFA rests on situations in which PCA approximates EFA and still delivers results close to those generated by EFA. This approximation is possible when error variances are too small to be considered significant. In addition, most data analysis packages offer PCA as the default option for factor analysis [21]. This is not the case with MATLAB R2023a, which offers two built-in functions: the pca function for PCA and the factoran function for factor analysis. The factoran function calculates the maximum likelihood estimate (MLE) of the factor loading matrix in the factor analysis model [25]. The simplest difference between PCA and EFA is that, unlike PCA, EFA assumes a model [26]. The application of PCA relies on sufficient correlation among the original variables; in contrast, EFA is applied where there is a latent characteristic among the observed variables [21]. This study augments PCA, a common approach to factor analysis, with Pearson correlation, a common approach to feature selection, which is used to create a subset of factors.

The Measure of Adequacy/Suitability of Data for Factor Analysis
The Measure of Sampling Adequacy (MSA) is a critical statistical tool used to evaluate the suitability of data for factor analysis [27]. In the context of exploratory factor analysis, the MSA provides valuable insights into the correlation structure of variables and their appropriateness for factor extraction. The Kaiser-Meyer-Olkin (KMO) test is one of the most commonly employed methods to assess the MSA [28]. The KMO test evaluates the degree of inter-correlation among variables and computes a value ranging between 0 and 1. A value above 0.6 is generally considered acceptable, indicating that the data exhibit sufficient commonalities for factor analysis. Bartlett's test of sphericity is another fundamental measure used in tandem with the KMO test [28]. It examines the null hypothesis that the variables in the dataset are uncorrelated. A significant p-value from Bartlett's test (usually less than 0.05) suggests that the null hypothesis can be rejected, reinforcing the appropriateness of factor analysis for the given dataset [27]. Together, the KMO and Bartlett's tests play a pivotal role in the preliminary stages of factor analysis, assisting researchers in ascertaining the robustness and reliability of their data for further exploration of underlying latent structures. However, there are cases where the KMO value falls below the threshold of 0.5. This could be attributed to a small sample size or the presence of multicollinearity in the data. Correlation coefficients in the correlation matrix should be greater than 0.3 to provide evidence of sufficiently strong relationships between the variables. Furthermore, the determinant of the correlation matrix, a measure of multicollinearity, should be greater than 0.00001; when it is less than this, variable pairs with correlation coefficients greater than 0.8 ought to be eliminated [20].
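The two adequacy checks described above can be sketched in a few lines. The example below is an illustrative NumPy implementation rather than the code used in this study. It assumes the standard definitions: the overall KMO statistic is the sum of squared off-diagonal correlations divided by that sum plus the sum of squared off-diagonal partial correlations, and Bartlett's statistic is $\chi^2 = -\left(n - 1 - \frac{2p + 5}{6}\right)\ln|R|$ with $p(p-1)/2$ degrees of freedom, where $R$ is the correlation matrix of $p$ variables measured on $n$ observations. The function name and the synthetic data are hypothetical.

```python
import numpy as np


def kmo_and_bartlett(data):
    """Return (kmo, chi_square, dof, determinant) for an n x p data matrix."""
    x = np.asarray(data, dtype=float)
    n, p = x.shape
    corr = np.corrcoef(x, rowvar=False)

    # Partial correlations are obtained from the inverse correlation matrix.
    inv = np.linalg.inv(corr)
    scale = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / scale

    off_diagonal = ~np.eye(p, dtype=bool)
    r2 = np.sum(corr[off_diagonal] ** 2)
    q2 = np.sum(partial[off_diagonal] ** 2)
    kmo = r2 / (r2 + q2)

    # Bartlett's test of sphericity: H0 is that the variables are uncorrelated.
    det = np.linalg.det(corr)
    chi_square = -(n - 1 - (2 * p + 5) / 6.0) * np.log(det)
    dof = p * (p - 1) // 2
    return kmo, chi_square, dof, det


# Synthetic data only: 40 observations of 5 variables sharing one common factor.
rng = np.random.default_rng(0)
common = rng.normal(size=(40, 1))
synthetic = common + 0.8 * rng.normal(size=(40, 5))

kmo, chi_square, dof, det = kmo_and_bartlett(synthetic)
print(f"KMO={kmo:.3f}, chi2={chi_square:.1f}, dof={dof}, det={det:.5f}")
```

The p-value follows from comparing the statistic with a chi-square distribution with `dof` degrees of freedom. The returned determinant can be checked directly against the 0.00001 multicollinearity guideline mentioned above.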

Principal Component Analysis
The prevalence of large datasets has become increasingly widespread across various disciplines, including the real estate and construction sectors. This necessitates methods that can effectively reduce the dimensionality of these datasets while ensuring interpretability and preserving essential information [29]. PCA is one of the most widely used techniques for this purpose. It is a statistical technique that reduces the dimensionality of a dataset by focusing only on the data explained by the principal components with the most variability or dispersion. Its primary objective is therefore to reduce dataset dimensionality while retaining as much statistical information, often referred to as variability, as possible [29]. This is accomplished using a transformation that yields a new set of variables, known as principal components (PCs), which are uncorrelated with one another. This reduces the incidence of multicollinearity that results from various factors being correlated. Moreover, the principal components are ranked in order of the variance that each of them explains, so that the initial few components retain a substantial portion of the total variation in the original variables: the first principal component explains the largest fraction of the total variance of the data, the second explains the second largest, and so on [30].
However, when performing PCA, one is often confronted by two dilemmas: the choice between a correlation or covariance matrix for the extraction of eigenvalues and eigenvectors, and the number of principal components to select. According to Valle et al. [31], a critical consideration in the development of a PCA model is the selection of an appropriate number of PCs to represent the system optimally. Optimal selection of PCs is vital, as an inadequate number of PCs will yield a subpar model and an incomplete portrayal of the underlying process. Conversely, choosing an excessive number of PCs leads to over-parameterization of the model, introducing noise and detracting from its accuracy and interpretability. Furthermore, success in PCA typically involves retaining a small number of components that explain a significant portion of the variability, typically around 70% to 80%. This criterion of retaining a good proportion of explained variability is widely accepted in PCA practice [32].
The second challenge, according to Valle et al. [31], is the choice between a correlation matrix-based PCA and a covariance matrix-based PCA for principal component modeling. In conducting PCA, it is crucial to achieve a dimensionless state while simultaneously retaining information on marginal variabilities. Deliberately and artificially excluding marginal variability would shift the objective of PCA towards seeking optimal linear combinations of data that account for variability rather than focusing on variability as conventionally emphasized. According to Jolliffe and Cadima [29], the choice between the covariance matrix-based PCA and the correlation matrix-based PCA depends on the units of measurement of the variables in the dataset. When variables have different units of measurement, the covariance matrix-based PCA may yield undesirable outcomes due to its dependence on unit-specific variance. In contrast, conducting PCA on standardized data using the correlation matrix resolves this issue. However, the correlation matrix-based PCA typically requires more principal components to explain the same proportion of variance compared to the covariance matrix-based PCA. Nevertheless, the correlation matrix-based PCA is advantageous for handling variables with different units of measurement. PCA has been applied in knowledge management studies to assess the factors influencing the success of knowledge management in quantity surveying firms [33], in project management studies to assess project management competencies for the Ghanaian construction industry [34], and in sustainable construction to identify barriers to sustainable construction in the US [35].
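The two choices discussed above, namely the matrix used for extraction and the number of components retained, can be made concrete with a small example. The following is a minimal sketch of correlation matrix-based PCA in NumPy, not the implementation used in this study. The function names, the synthetic data, and the 70% default threshold are illustrative assumptions.

```python
import numpy as np


def correlation_pca(data):
    """PCA on the correlation matrix, i.e., on standardized variables.

    Returns (loadings, scores, eigenvalues, explained_percent), with components
    ordered from largest to smallest eigenvalue.
    """
    x = np.asarray(data, dtype=float)
    z = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)  # removes units of measurement
    corr = np.corrcoef(x, rowvar=False)
    eigenvalues, eigenvectors = np.linalg.eigh(corr)  # eigh returns ascending order
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]
    explained = 100.0 * eigenvalues / eigenvalues.sum()
    scores = z @ eigenvectors
    return eigenvectors, scores, eigenvalues, explained


def components_for_threshold(explained, threshold=70.0):
    """Smallest number of leading PCs whose cumulative variance reaches the threshold."""
    cumulative = np.cumsum(explained)
    k = int(np.searchsorted(cumulative, threshold)) + 1
    return min(k, len(explained))


# Synthetic data only: five variables, two correlated pairs plus one noise column.
rng = np.random.default_rng(1)
base = rng.normal(size=(30, 2))
data = np.column_stack([
    base[:, 0],
    base[:, 0] + 0.3 * rng.normal(size=30),
    base[:, 1],
    base[:, 1] + 0.3 * rng.normal(size=30),
    rng.normal(size=30),
])

loadings, scores, eigenvalues, explained = correlation_pca(data)
n_keep = components_for_threshold(explained)

# Changing the unit of one variable leaves correlation-based PCA unchanged.
rescaled = data.copy()
rescaled[:, 0] *= 1000.0
_, _, _, explained_rescaled = correlation_pca(rescaled)

print(np.round(explained, 2), n_keep)
```

Because the variables are standardized before extraction, multiplying a column by a constant (for example, converting metres to millimetres) does not alter the explained variance. A covariance-based PCA would instead be dominated by the rescaled column.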

Pearson Correlation
The Pearson correlation coefficient quantifies the strength and direction of the linear relationship between two variables within a set of variables. It assumes values ranging from −1 to +1, where a value of +1 signifies a perfect positive linear correlation, a value of 0 indicates the absence of a linear relationship, and a value of −1 indicates a perfect negative linear correlation [36]. The correlation matrix plays a crucial role in the selection of pertinent features. In this research, emphasis was placed on identifying highly correlated input and output attributes. The primary advantage of correlation analysis is that it is simple and easy to implement, providing a quick initial assessment of the potential relevance of features. This is essential as it guides the analyst to focus on the most promising variables. However, correlation analysis has limitations: it only captures linear relationships and may overlook nonlinear associations between variables. It also follows that altering the variables through linear or scale transformations, whether for input, output, or both, will not impact the correlation coefficient. The correlation coefficient is further influenced by the range of observations, with a larger range resulting in a larger correlation coefficient. This calls for caution when analyzing datasets with different ranges of observations. Additionally, correlation does not imply causation, and a high correlation does not necessarily indicate a strong predictive relationship. Although one variable may have a causal effect on another, correlation analysis overlooks other explanations, such as confounding [37]. To mitigate these limitations, it is recommended to combine correlation analysis with other feature extraction methods, such as PCA, as a validation benchmark. This allows for a more comprehensive evaluation of feature importance and enhances the robustness of the selected features. Pearson correlation has been used in construction studies as a measure of the strength and direction of linear relationships between factors [38,39].
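Two of the properties noted above, invariance under linear rescaling and blindness to nonlinear association, can be checked with a short example. The sketch below implements the Pearson coefficient directly from its definition. The function name and all data values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical observations: number of masons and wall area built (m^2).
masons = [1, 2, 2, 3, 4]
area_m2 = [5.0, 9.5, 10.5, 14.0, 21.0]
r = pearson(masons, area_m2)

# A positive linear change of units (m^2 to cm^2, plus an offset) leaves r unchanged.
area_rescaled = [10000.0 * a + 3.0 for a in area_m2]
r_rescaled = pearson(masons, area_rescaled)

# A perfect but nonlinear (quadratic) relationship can still give r = 0.
x = [-2, -1, 0, 1, 2]
y = [v * v for v in x]
r_nonlinear = pearson(x, y)

print(round(r, 3), round(r_rescaled, 3), r_nonlinear)
```

The last case shows why a low coefficient alone does not rule out a relationship, and why the text recommends pairing correlation analysis with a second method such as PCA.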

Data Collection
The dataset used for modeling productivity in this study was collected through direct field observations and interviews at residential sites located in Kampala and Wakiso districts between January and March 2023. A total of 20 sites were visited, employing both stratified random and purposive sampling techniques to identify a suitable sample. The study area was composed of the five divisions of Kampala (Figure 1) and Kira Sub-County, Wakiso district. Through a reconnaissance visit, the study sites were organized according to their location in these five divisions and Wakiso district and grouped by their parishes. The parishes sampled were chosen randomly, and a minimum of three parishes were chosen per division or sub-county, as illustrated in Table 1. Based on the sites identified through the reconnaissance visits, one site was chosen randomly from each parish to ensure that materials were sourced from different locations. Extra parishes were chosen for Kampala Central and Kawempe, as they are among the smallest divisions in Kampala City and the additional sites were necessary for statistical significance.
Though sites using concrete blocks were identified during the reconnaissance visits, purposive sampling was used to retain only active sites that were using bricks and to eliminate sites using concrete blocks as the masonry unit. This study tracked only one operation, brick wall construction, taking into account several parameters such as the number of masons and helpers, the area of the wall constructed, the amount of mortar used for bonding the bricks, and the brick dimensions.
The sites were visited twice a day: first in the morning to ascertain the initial site conditions and again in the evening to record the amount of work completed. The collected data were classified into three groups: weather, crew, and project. Data related to mason and helper numbers and wages were categorized as crew data, while data related to wall dimensions, area, and location were classified as project data. The weather category included precipitation and temperature. Data from 10 factors were classified into three categories, as shown in Table 2. The independent variables included the number of masons and helpers, the mason and helper wages, crew competency, weather, and man-hours, and the dependent variables comprised the wall length, height, and area constructed. The confounding variables consisted of the variables in the project data. The productivity factors tracked were informed by Lawaju et al. [20], with a focus on manpower and project-related factors. Moselhi and Khan [21] added weather-related factors, including humidity and wind; however, this study focused on temperature and precipitation, classifying weather as sunny, cloudy, or rainy.

Table 1. Sampled parishes by division or sub-county (recoverable entries: Naankulabye; Nsambya Central). The numbers 1-3 serve to differentiate the parishes.

Measure Adequacy/Suitability of the Dataset for Factor Analysis
This is an essential preliminary step in factor analysis. It evaluates whether the dataset is suitable for factor analysis before the analysis is undertaken. Two tests are employed: the Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy and Bartlett's test of sphericity.
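For reference, the two statistics are commonly written as follows. The notation is introduced here for illustration and is not taken from the study: r_ij are the off-diagonal entries of the correlation matrix R, a_ij are the corresponding partial correlations, n is the number of observations, and p is the number of variables.

```latex
\mathrm{KMO} = \frac{\sum_{i \neq j} r_{ij}^{2}}{\sum_{i \neq j} r_{ij}^{2} + \sum_{i \neq j} a_{ij}^{2}},
\qquad
\chi^{2} = -\left(n - 1 - \frac{2p + 5}{6}\right)\ln\lvert \mathbf{R} \rvert,
\qquad
\mathrm{df} = \frac{p(p - 1)}{2}
```

A KMO value closer to 1 indicates that partial correlations are small relative to the raw correlations, and a significant chi-square value in Bartlett's test indicates that R differs from the identity matrix. As a purely arithmetic example, p = 9 variables would give df = 36.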

Principal Component Analysis
Principal Component Analysis (PCA) served as the method for conducting Exploratory Factor Analysis (EFA), as depicted in Figure 2. A custom MATLAB code was developed using the built-in MATLAB PCA function, which produced four matrices for further analysis: the coefficient, score, latent, and explained matrices. The coefficient matrix, also known as the loading matrix, gives the contribution of each original variable to each principal component. The score matrix contains the transformed data, that is, the observations projected onto the principal components, which allows the data to be represented in a lower-dimensional space. The latent matrix contains the eigenvalues, which give the variance explained by each principal component. Lastly, the explained matrix expresses the proportion of total variance explained by each principal component. Together, these matrices form the basis for understanding the underlying structure and dimensionality of the data. The number of PCs retained was based on the cumulative variance explained, using a threshold of 70% established in the literature review. A factor loading threshold of 0.4 was then applied: factors whose loadings were greater than 0.4 were considered to be heavily correlated with the corresponding PC.
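The two selection rules described above can be sketched in a few lines. This is a minimal illustration in Python rather than the study's MATLAB code. The eigenvalues and most loadings are made-up values, the function names are hypothetical, and taking the absolute value of a loading is an assumption; only the 70% and 0.4 thresholds and the two PC1 loadings (0.576 and 0.524) come from the text.

```python
def explained_percent(eigenvalues):
    """Percentage of total variance carried by each principal component."""
    total = sum(eigenvalues)
    return [100.0 * ev / total for ev in eigenvalues]


def components_for_threshold(explained, threshold=70.0):
    """Smallest number of leading PCs whose cumulative variance reaches the threshold."""
    cumulative = 0.0
    for k, pct in enumerate(explained, start=1):
        cumulative += pct
        if cumulative >= threshold:
            return k
    return len(explained)


def heavy_loadings(names, coeff, pc_index, cutoff=0.4):
    """Variables whose absolute loading on one PC exceeds the cutoff (assumed rule)."""
    return [name for name, row in zip(names, coeff) if abs(row[pc_index]) > cutoff]


# Hypothetical eigenvalues for nine components (illustrative only).
eigenvalues = [2.7, 1.8, 1.4, 1.0, 0.8, 0.6, 0.4, 0.2, 0.1]
explained = explained_percent(eigenvalues)
n_components = components_for_threshold(explained)

# Rows are variables, columns are PCs; second column and third row are hypothetical.
names = ["weather", "wall_height", "masons"]
coeff = [[0.576, 0.10], [0.524, -0.20], [0.20, 0.60]]
heavy = heavy_loadings(names, coeff, pc_index=0)
print(n_components, heavy)
```

With these illustrative eigenvalues the cumulative variance first passes 70% at the fourth component, and the variables flagged on the first component are weather and wall height.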

Pearson Correlation
Pearson correlation analysis was conducted using the Microsoft Excel Analysis ToolPak. A threshold of 0.5 was set as the cutoff point to identify factors worthy of consideration, and a subset of factors was selected based on the strength of their correlation with the production rate. This allowed the analysis to focus on the factors that showed a substantial relationship with the production rate. The interpretation of the coefficients is presented in Table 3.
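The screening step can be sketched as follows. This is a minimal illustration, not the spreadsheet procedure used in the study. The data values and function names are hypothetical, and applying the 0.5 cutoff to the absolute value of the coefficient is an assumption.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def select_by_correlation(factors, production_rate, cutoff=0.5):
    """Keep factors whose |r| with the production rate meets the cutoff (assumed rule)."""
    selected = {}
    for name, values in factors.items():
        r = pearson(values, production_rate)
        if abs(r) >= cutoff:
            selected[name] = r
    return selected


# Illustrative values only, not the study's measurements.
production_rate = [1.0, 1.5, 2.0, 2.5, 3.0]
factors = {
    "wall_length": [2.0, 3.1, 3.9, 5.2, 6.0],
    "temperature": [25.0, 22.0, 27.0, 23.0, 25.0],
}
selected = select_by_correlation(factors, production_rate)
print(sorted(selected))
```

In this toy dataset only the wall length passes the cutoff, because the temperature series is nearly uncorrelated with the production rate.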

Descriptive Statistics
Table 4 shows the descriptive statistics of the collected data. These quantitative measures summarize the central tendency, variability, and distribution of the dataset.

Correlation Method Results
The Pearson correlation coefficient is used to examine the strength and direction of a linear relationship between two variables in a dataset. The coefficient ranges between −1 and +1, and a larger absolute value indicates a stronger relationship between the variables. An absolute value of 1 indicates a perfect linear relationship, and a value of 0 indicates no linear relationship between the variables. Table 5 shows the correlation coefficients of the study variables.

Suitability for Factor Analysis
The findings from the initial analyses, aimed at evaluating the correlation among the variables under investigation, are presented in Table 6. The extent of correlation offers valuable insights into the underlying data structure and can reveal indications of multicollinearity. Multicollinearity arises when the independent variables are strongly interrelated, and it often results in a diminished Kaiser-Meyer-Olkin (KMO) value. The correlation matrix was therefore used to identify potential multicollinearity, prompting the removal of variables with correlation coefficients exceeding 0.8. Table 7 presents a summary of the results from the KMO and Bartlett's tests. This summary includes the chi-square statistic, degrees of freedom, p-value, and KMO value for two scenarios: one in which the highly correlated wall area factor was deleted from the dataset and one in which it was retained when testing for sampling adequacy.
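The multicollinearity screen can be sketched as a scan of the upper triangle of the correlation matrix. This is a minimal illustration under stated assumptions: the function name is hypothetical, and the matrix is made up except for the 0.861 correlation between wall area and wall length reported in this study. The sketch only flags pairs; deciding which variable of a pair to drop is left to the analyst.

```python
def high_correlation_pairs(names, corr, limit=0.8):
    """List (name_i, name_j, r) for off-diagonal pairs with |r| above the limit."""
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(corr[i][j]) > limit:
                pairs.append((names[i], names[j], corr[i][j]))
    return pairs


names = ["wall_length", "wall_height", "wall_area"]
# Symmetric correlation matrix; only the 0.861 entry is taken from the text.
corr = [
    [1.000, 0.300, 0.861],
    [0.300, 1.000, 0.500],
    [0.861, 0.500, 1.000],
]
flagged = high_correlation_pairs(names, corr)
print(flagged)
```

Here the only flagged pair is wall length and wall area, which matches the decision to test sampling adequacy with and without the wall area variable.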

The Result of the PCA Method
Figure 3 illustrates the proportion of variance explained by each principal component. Overall, PC1 and PC2 together explain 49.54% of the total variance, which is more than half of the 93.958% explained cumulatively by PC1 to PC7. To construct collections of features with a high loading contribution on a factor (equal to or greater than 0.4), the highly correlated factors in PC1, PC1 to PC2, and PC1 to PC5 were grouped into feature sets B, C, and D, respectively. Set B consisted of weather and wall height; set C comprised the number of masons and helpers, wall length, wall height, area of wall built, weather, and duration; and set D consisted of all input variables.

Choice of Principal Components
The optimal number of PCs was chosen based on a 2D scree plot (Figure 2) and a literature-backed threshold of 70% variance [32]. Feature selection using PCA biplots is illustrated in Figures 4-7, and the features were chosen based on their relationship with labor productivity. The direction of a feature vector indicates whether its correlation is positive or negative. When a feature vector closely aligns with the productivity vector, so that the angle between the two is small, it signifies a strong positive correlation. Conversely, a feature vector pointing in the opposite direction indicates a negative correlation. Vectors that are nearly perpendicular to the productivity vector indicate weak correlations.
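The angle-based reading of a biplot can be made concrete with the cosine of the angle between two vectors. This is a rough sketch, assuming the two plotted components capture most of the variance so that the cosine approximates the correlation. The coordinates and the function name are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two biplot vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical (PC1, PC2) coordinates of variable vectors.
productivity = (0.8, 0.2)
vectors = {
    "masons": (0.7, 0.3),        # small angle: strong positive association
    "weather": (-0.6, -0.1),     # opposite direction: negative association
    "wall_height": (-0.2, 0.8),  # near perpendicular: weak association
}
cosines = {name: cosine(vec, productivity) for name, vec in vectors.items()}
for name, value in cosines.items():
    print(name, round(value, 3))
```

A cosine near +1 corresponds to a small angle, a cosine near −1 to an opposing direction, and a cosine near 0 to a nearly perpendicular vector.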

Based on the PCA-biplot, feature set E was selected, comprising the number of masons, number of helpers, weather, and crew competency; the subset of features selected using PCA is summarized in Table 8. Principal Component Analysis (PCA) is a valuable method in multivariate analysis for revealing data patterns. The PCA biplot (Figure 3) combines observations and variables in a reduced-dimensional space, showing relationships and similarities. Observations are plotted as points and variables as vectors, whose directions highlight variable influences; proximity between points signifies similarity. The loading plots (Figures 4-6) focus on the variables and depict their significance on the principal components, with high loadings implying strong correlations. Interpreting PCA biplots and loading plots together unveils data structures, aiding in variable selection and pattern recognition.
The three-dimensional biplot portrays the factor scores and loadings in three planes: PC1, PC2, and PC3. For ease of interpretation, two-dimensional biplots are also provided, because the relationship between PCs and individual factors is easier to visualize in two dimensions. The biplot of PC1 versus PC2 (Figure 5) represents the relationship and relative contributions of variables to these principal components and shows the directional influence of the variables on the plotted components. The biplot of PC3 against PC2 (Figure 6) illustrates the interaction and significance of variables in these principal components and shows how variables contribute differently to these dimensions. The biplot of PC3 against PC1 (Figure 7) visualizes the interplay and relative importance of variables in these principal components, revealing their association and impact within the dataset.
A comparison between the subsets of factors derived from the Pearson correlation and Principal Component Analysis is presented in Table 9. The purpose of this comparison is to gain a deeper understanding of the feature subsets of essential variables influencing labor productivity in brickwork based on the Pearson correlation, PCA, and biplot methods of feature extraction.
x implies that this variable is an element of that set.
The analysis explores the selection of productivity factors and various sets of features using Pearson correlation and Principal Component Analysis (PCA). Sets A, B, C, D, and E were studied based on the factors that were heavily correlated with the PCs at a loading threshold of 0.4, with Set D having the highest number of selected features (6). The results allow a comparison between Pearson correlation and PCA based on the significant factors impacting labor productivity selected by each method.
The factors forming the final subset were chosen based on the Euclidean distance of their component loadings across the nine PCs in Table 10. The Boolean returned true for factors with a Euclidean distance greater than 0.7 and false for factors that did not meet this threshold. This approach assesses the extent to which an individual factor plays a substantial role in the variability of the dataset. The Euclidean distance of the factor loadings provides a quantitative measure of the dissimilarity between the factors and the principal components: a larger magnitude indicates a greater contribution of that factor to the dataset's variability. A minimum cutoff of 0.7 was adopted as the selection threshold in this study.
The scatter plot of the first two principal components (PC1 and PC2) in Figure 8 shows that these two components capture the most important variation in the dataset. The clusters are well separated in the two-dimensional space of PC1 and PC2, which suggests that these two principal components can be used to cluster the data effectively. This demonstrates the effectiveness of PCA for dimensionality reduction and clustering of data.
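The Euclidean-distance rule can be sketched as the length of each factor's loading vector compared against the 0.7 cutoff. This is a minimal illustration, not the study's MATLAB routine. The loadings and function names are hypothetical. For an orthonormal coefficient matrix the sum of squared loadings over all PCs equals 1 for every factor, so the sketch assumes the distance is taken over a subset of retained PCs (four columns here); that choice is an assumption.

```python
import math


def loading_distance(row):
    """Euclidean length of one factor's loading vector over the chosen PCs."""
    return math.sqrt(sum(value * value for value in row))


def select_by_distance(names, loadings, cutoff=0.7):
    """Map each factor to (distance, Boolean flag for distance above the cutoff)."""
    result = {}
    for name, row in zip(names, loadings):
        distance = loading_distance(row)
        result[name] = (distance, distance > cutoff)
    return result


# Hypothetical loadings on four retained PCs (illustrative only).
names = ["weather", "wall_height"]
loadings = [
    [0.6, 0.5, 0.4, 0.3],
    [0.3, 0.2, 0.1, 0.1],
]
result = select_by_distance(names, loadings)
for name, (distance, keep) in result.items():
    print(name, round(distance, 3), keep)
```

In this toy example the first factor exceeds the cutoff and is kept, while the second does not.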

Pearson Correlation Results
The examination of the Pearson correlation values shows that the number of masons, wall length, wall height, and area of wall built had the highest association and dependency based on a threshold value of 0.5. These factors were grouped in feature set A, while the other factors had the lowest association. Therefore, it is possible to model brickwork labor productivity as a function of the number of masons, wall length, wall height, and area of the wall built.

Descriptive Statistics
The results of the analysis revealed valuable insights into the dataset and its relationship with labor productivity. Table 4 provides descriptive statistics, offering a summary of the collected data, including measures of central tendency (averages) and variability (spread of data points). The mean number of masons and helpers was three and two, respectively, working for an average daily wage of UGX 31,000 and UGX 16,750 at a rate of 1.476 square meters per day. This implies that, on average, the ratio of mason to helper was 1:2, meaning that for every mason hired, two helpers were employed for assistance. It was frequently observed that while one helper aided in the construction process, the other maintained the continued supply of materials. The Pearson correlation coefficient was used to examine the strength and direction of linear relationships between variables in the dataset (Table 5). The correlation matrix showed that the number of masons, wall length, wall height, and area of the wall built exhibited the highest associations and dependencies with the production rate. These factors were grouped into feature set A, indicating their potential significance in modeling brickwork labor productivity.

Adequacy of Data for Factor Analysis and PCA
Table 5 gives the correlation coefficients of the study variables; the wall area is highly correlated with wall length, with a correlation coefficient of 0.861. Because of this high inter-correlation, the wall area variable was eliminated to remove any bias. After the wall area was removed from the dataset, the KMO value of 0.544 indicated a moderate degree of common variance, supporting the appropriateness of factor analysis. The significant result from Bartlett's test further validated the dataset's suitability for factor analysis, providing confidence in the exploration of underlying latent structures. Principal Component Analysis (PCA) was then performed to explore the dataset's dimensionality and reveal the proportion of variance explained by each principal component (Figure 2). The number of PCs was chosen using a threshold of 70% of the dataset variance, resulting in four PCs (PC1 to PC4) being retained out of the nine generated. Notably, PC1 and PC2 jointly explained 49.54% of the total variance, with PC1 to PC7 cumulatively explaining 93.96%. Based on factor loadings, feature sets B, C, and D were derived, with a set of selected features comprising the number of masons, number of helpers, mason competency, and weather. These feature sets highlighted the significant variables influencing labor productivity. The PCA-biplot provided further insight into feature selection (Figure 3), with feature set E confirming the importance of the number of masons, number of helpers, mason competency, and weather in modeling labor productivity.

Selected Features
The selected features for the different sets, obtained from Pearson correlation and PCA, are compared in Table 9. Weather and wall height contributed heavily to PC1, with coefficients of 0.576 and 0.524, respectively, as summarized in Table 7. As demonstrated in Figure 2, the first principal component (PC1) explains the most variance in the dataset, while the second principal component (PC2) explains the second most variance. Together, PC1 and PC2 therefore capture a significant amount of the information in the original dataset. The fact that the clusters are well separated in the two-dimensional space of PC1 and PC2 suggests that these two principal components can effectively distinguish between the different clusters. This is important because it means that PCA can be used to reduce the dimensionality of the dataset while still identifying the different clusters. Figure 9 shows the selection of variables based on their Euclidean distance.

Conclusions
The analysis identified pivotal factors that influence labor productivity in brickwork. The investigation underscored the significant contributions of the number of masons, number of helpers, mason competency, and weather conditions in shaping productivity patterns, such as the optimal crew size, where too many or too few workers could result in productivity dips. The insights from this study provide direction for decision-makers aiming to optimize brickwork procedures and enhance labor efficiency.
The use of Principal Component Analysis (PCA) to condense a subset of four factors from the original nine, as discerned from PC1 and PC2, demonstrated the efficacy of this dimension reduction technique. This approach disregarded extraneous variables such as wall dimensions and work duration, which may have exhibited high Pearson correlation values but lack relevance in labor productivity modeling. This shows that variables with strong correlations are not necessarily the most important ones when compared against other feature extraction methodologies such as PCA.

Recommendations
The MATLAB PCA algorithm devised in this study incorporated two techniques for optimal Principal Component (PC) selection: the graphical scree plot and a literature-based criterion that identifies PCs explaining over 70% of the variance. An additional Boolean function based on the average coefficients of individual factors was also employed. However, more rigorous methodologies should be explored to assess the applicability of the scree plot and the 70% threshold. While this study employed Principal Component Analysis for dimension reduction, which led to a subset of variables (k < p) analogous to factor analysis, it is essential to distinguish between PCA and Exploratory Factor Analysis. These two methodologies are distinct and should not be conflated, a common occurrence facilitated by certain statistical packages. Furthermore, a comprehensive understanding of PCA's prerequisites and relevance is vital before its implementation, which requires thorough research to determine the suitability of PCA for a given dataset.
Professionals in the field should regard the subset of factors identified here as a benchmark when planning and scheduling construction activities. This informed approach can potentially contribute to improved operational efficiency and decision-making processes. The selected factors will serve as the foundation for the development of a neural network model. This model will be constructed based on these critical features, offering predictive insights that can guide resource allocation strategies and human-robot collaboration within the construction industry. With the global industry advancing towards construction automation, Ugandan professionals' perspectives on human-robot collaboration and its local adoption warrant future research to inform sector-appropriate building construction policy reviews.

Figure 1. The Five Divisions of Kampala.

Figure 2. The Schematic of the PCA Procedure.

Table 1. Parishes from which Study Sites were Sampled.

Table 3. Interpretation of Pearson Correlation Analysis Results.

Table 4. Collected Data Descriptive Statistics.

Table 5. Pearson Correlation Matrix for Input and Output Parameters.

Table 6. Correlation Matrix of the Study Variables.

Table 8. Summary of Results from PCA.

Table 9. Summary of Feature Selection Comparison.

Table 10. Summary of Selected Factors Based on Euclidean Distance.