Big Data Technology in Construction Safety Management: Application Status, Trend and Challenge

The construction industry is a high-risk industry with many safety accidents. The popularity of Internet information technology has led to an explosion in the amount of data obtained in various engineering fields, and it is necessary to explore the current state of the application of big data technology in construction safety management. This paper systematically reviews 66 articles closely related to the research topic and objectives, describes the current status of big data application to various construction safety issues from the perspectives of both big data collection and big data analysis for engineering and construction projects, and lists by category the breakthrough results of big data analysis technology in improving construction safety. Finally, the trends and challenges of big data in the field of construction safety are discussed in three directions: the application of big data to worker behavior, the prospect of integrating big data technologies, and the integration of big data technologies with construction management. The aim of this paper is to present the current state of research on big data technology in support of construction safety management, to provide valuable insight into improving safety at engineering construction sites, and to provide guidance for future research in this field.


Introduction
The construction industry is an important material production sector and one of the pillar industries of China's national economy, and it is also a high-risk industry with a high incidence of safety accidents [1]. According to research studies, construction workers are three to six times more likely to be involved in safety accidents than workers in other industries [2]. Construction safety accidents injure and kill construction workers and cause significant direct and indirect losses. Therefore, an in-depth understanding of safety management on construction sites is of great importance and has a profound impact on the sustainable development of the construction industry. Over the last decade, various stakeholders in construction projects have attempted to improve safety in the construction process from a number of angles, such as the implementation of construction safety-related regulations and the development of construction safety management systems [3]. Despite these efforts, the construction industry still faces significant safety challenges [4].
In the era of rapid information technology development, data are generated everywhere and all the time, and the resulting pool of data with remarkable diversity and complexity is referred to as "Big Data" [5]. Because of the large scope and variety of the data, it is difficult to collect information efficiently and comprehensively with manual methods or a single computer. Instead, big data technology and other high-tech means are needed to collect, classify and process the data in order to fully exploit the value of data and information. The emergence of big data has prompted various industries to re-examine their scientific research methods and has triggered a series of technological revolutions. As a branch of scientific research, the field of construction safety management is also exploring ways to make use of big data [6]. The construction industry is already dealing with large volumes of heterogeneous data. With the commoditization of technologies such as the Internet of Things and the advent of the cloud era, this volume is expected to grow exponentially [7].
Many scholars, both nationally and internationally, have already conducted in-depth studies on the application of big data to construction safety. For example, Ayhan and Tokdemir [8] developed a new model using real data collected anonymously from various construction sites, combining latent class clustering analysis and artificial neural networks (ANN) to predict engineering construction accidents and to suggest the necessary preventive measures. In addition, Guo et al. [9] developed a big-data-based worker behavior observation platform that combines traditional behavioral observation with advanced technology to identify unsafe behavior patterns, so that strategies and techniques could be implemented to improve safety at construction sites. Su et al. [10] proposed a data-driven approach that uses convolutional neural network (CNN)-based image recognition to develop automated fire detection and alarm systems that improve safety at engineering construction sites, allowing for efficient prevention and rapid identification of unexpected emergencies.
Although scholars have made some progress in their research on the application of big data to construction safety, existing studies are limited to one or a few big data analysis techniques and do not provide a systematic understanding of how existing big data analysis techniques can serve the construction safety field. At the same time, the question of how to extract valid information and create value from big data remains paramount in the face of the sheer volume and growing complexity of data. Despite the increasing research on big data in the scientific community, there is little comprehensive research on the current state of knowledge. Therefore, the main objective of this study is to review and screen the literature of the last decade on the application of big data to various construction safety issues, to systematize the technical means of big data analysis, and to explore the current status, trends and challenges of the application of big data in the field of construction safety through three research topics: big data applied to worker behavior, the prospect of integrating big data technologies, and the integration of big data technology with construction management. This paper aims to provide valuable insights into improving safety at engineering construction sites and to provide guidance for future research in this field.
The innovation of this paper lies in the following aspects: (1) At present, few papers comprehensively classify and sort out the big data technologies applied to the field of construction safety; most studies focus on a specific big data technology applied to a specific construction safety problem. (2) This paper classifies the literature according to big data collection and four forms of big data analysis, and comprehensively summarizes the research status of big data analysis technology in construction safety in these four forms: text analysis, audio analysis, video analysis and predictive analysis. This ensures the completeness of the literature review and the scientific rigor and effectiveness of its guidance for engineering construction. (3) On the basis of a comprehensive summary of the existing literature, the paper discusses three aspects, namely the application of big data to workers' behavior, the integration of big data technologies, and the combination of big data with construction management systems, and provides an outlook on future research on the sustainable development of engineering construction safety.
The sections are organized as follows: Section 2 presents the methodology of the paper: the criteria by which the literature was collected and classified, and the keyword co-occurrence analysis of the sample literature with the VOSviewer literature analysis software. Section 3 provides a detailed list of existing technical tools and the current state of application of big data collection and big data analysis. Section 4 discusses three aspects: the application of big data to worker behavior, the integration of big data technologies, and the integration of big data with construction management systems. Finally, Section 5 concludes the paper.

Framework Design of Research Methodology
The aim of this paper is to systematically analyze how the increasing application of big data analytics in engineering construction activities since the big data revolution has had a positive effect on construction safety, and to comprehensively sort out the technical means of big data collection and big data analytics in construction safety. In addition, we discuss the opportunities and challenges that big data applications in the field of construction safety still face in several dimensions.
Based on this research theme and objective, the target literature is screened and analyzed; the detailed screening process is described in Section 2.2. Next, in Section 2.3, the keywords of the target literature are clustered in order to understand the current research issues and frontier hotspots. This lays the groundwork for a better classification of the literature on the one hand, and helps to clarify the current research status and identify future frontier issues on the other.

Literature Source
When searching the literature, we first searched the Web of Science (WoS) database using the subject term "Big Data" and found that literature on big data dates back to 1974; as of September 2021, the number of papers on big data had grown steadily and exceeded 140,000. Of these, searches for "Big Data + construction" (5095 articles), "Big Data + engineering" (18,542 articles), "Big Data + building" (12,573 articles) and "Big Data + architecture" (10,703 articles) returned 46,913 references on engineering and construction, accounting for 32% of the literature on big data. Since the relevant literature was found to be published mainly between 2011 and 2021, we focus on this decade, during which the application of big data in construction safety became increasingly important and has more reference value. Extensive reading showed that some articles did not cover the field of construction safety, so the search was then narrowed down. To further refine the literature, we used various keyword combinations for secondary screening, such as "Big Data, Machine learning, Prediction model, Data mining" and "Construction safety, Worker safety, Construction risk, Engineering risk", as shown in Table 1, and 2128 documents were obtained.

Table 1
Step | Title/Abstract/Keywords | Document Type | Language
Step 1 | ("Big data") AND ("Construction" OR "Engineering" OR "Building" OR "Architecture") | Research articles/Conference articles/Literature reviews | English
Step 2 | ("Machine learning" OR "Prediction model" OR "Data mining") AND ("Construction" OR "Engineering" OR "Building" OR "Architecture" OR "worker") AND ("safety" OR "risk") | Research articles/Conference articles/Literature reviews | English

Further manual review was carried out according to the following selection criteria to ensure the quality and relevance of the articles.
(1) Inclusion criteria: The research object of this paper is the current state of big data applications in the field of construction safety. We collected all kinds of big-data-related technical means used in the field of construction safety and sorted them according to big data collection and the four forms of big data analysis technology. Literature was therefore included whether it gave a general overview of the impact of big data on construction safety, examined the impact of a single big data analysis technology on the field, or examined the impact of a single big data analytics technique on a particular dimension of construction safety.
(2) Exclusion criteria: Literature completely unrelated to engineering and construction safety was eliminated. We mainly excluded literature on specific equipment construction, software design and other purely technical aspects, which did not match the research theme and objectives of this paper.
(3) In-depth search: After manual screening, we obtained 50 papers highly relevant to the research topic of this paper. On this basis, we screened the references listed in these papers. Taking into account the databases in which these references are indexed and the quality of the articles, we also used Google Scholar to search for literature related to the research topic. In addition, during the research process, the literature was continuously searched and supplemented in the Web of Science database.
We eventually identified 66 papers that met the research criteria for this paper. The process is shown in Figure 1.
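The two-step keyword screening in Table 1 can be sketched as follows. This is only an illustration of how the Boolean combinations are assembled: the helper names `or_group` and `and_query` are hypothetical, the strings are generic Boolean expressions rather than the exact advanced-search syntax of any database, and the hit counts are the ones reported in the text. Note that summing the four counts does not remove records that match more than one term.

```python
def or_group(terms):
    """Quote each term, join with OR and wrap the group in parentheses."""
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"


def and_query(*groups):
    """Combine several OR-groups with AND."""
    return " AND ".join(or_group(g) for g in groups)


# Keyword groups taken from Table 1.
DOMAIN = ["Construction", "Engineering", "Building", "Architecture"]
METHODS = ["Machine learning", "Prediction model", "Data mining"]

STEP1 = and_query(["Big data"], DOMAIN)
STEP2 = and_query(METHODS, DOMAIN + ["worker"], ["safety", "risk"])

# Hit counts reported in the text for the four Step 1 combinations.
HITS = {
    "construction": 5095,
    "engineering": 18542,
    "building": 12573,
    "architecture": 10703,
}
TOTAL_HITS = sum(HITS.values())  # simple sum, overlapping records not removed

if __name__ == "__main__":
    print(STEP1)
    print(STEP2)
    print(TOTAL_HITS)
```

Keeping the keyword groups as plain lists makes it easy to rerun the screening with an extended vocabulary and to document exactly which combination produced which result set.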

Co-Occurrence Analysis of the Literature
This section performs a co-occurrence analysis of keywords in the literature. In general, the number of keywords for co-occurrence analysis should be between 200 and 500. The literature reviewed in this paper contains 201 keywords, which is suitable for keyword co-occurrence analysis. Figure 2, produced with the visual analysis tool VOSviewer, shows the keyword co-occurrence network for the application of big data technology in construction safety management, including the relationships between keywords and the differently colored keyword sets. Different colors represent different sets, and the frequency with which a keyword occurs is indicated by the size of its node. The strength of the relationship between nodes is indicated by the distance between them: the farther apart two nodes are, the weaker the relationship between the keywords, and the closer they are, the more readily they cluster to form co-occurring phrases.
Based on the results of the co-occurrence clustering, Table 2 summarizes the five main co-occurrence clusters in Figure 2 and shows that "Big Data" is the center of the five keyword clusters and is closely connected to the other four. It also has the largest node. In Group 1, "Big Data analytics", "Cloud computing", "Construction", "Data analytics", "Hadoop" and "Safety" show several existing forms of big data technology and the current state of big data technology for construction safety. In Group 2, "Machine learning" and "Activity recognition" are representative of the big data technology approaches studied and discussed. Machine learning methods, for example, can help to assess safety risks from input data with the aid of a series of algorithms. "Processing", "Deep learning" and "Construction workers" in Groups 3 and 4 refine the specific techniques used for big data analytics in the field of construction safety; many of the articles examine the safety of construction workers through audio and video analysis. In Group 5, "Motion capture", "Motion recognition" and "Motion sensor" show that motion recognition is an important part of big data analytics applied to construction worker safety. Recognition is a hot topic in the application of big data analytics to construction safety management.
Buildings 2022, 12, 533
In addition, a keyword hotspot map was developed using VOSviewer (Figure 3). As shown in Figure 3, the research hotspots mainly include big data analytics, deep learning, machine learning, and construction safety. This confirms that the main research content focuses on "the positive impact of big data analysis technology on construction safety".
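The counting step behind a keyword co-occurrence map can be sketched as below. This is a minimal illustration, not the procedure implemented in VOS viewer, which additionally normalizes link strengths and computes its own layout and clustering. The function name `cooccurrence` and the three toy keyword lists are assumptions made for the example and are not data from the reviewed papers.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(keyword_lists):
    """Count how often each keyword occurs and how often each pair co-occurs.

    Each element of keyword_lists is the keyword list of one paper.
    Keywords are lower-cased and de-duplicated per paper, so a pair is
    counted at most once per paper.
    """
    occurrences = Counter()
    links = Counter()
    for keywords in keyword_lists:
        unique = sorted({k.strip().lower() for k in keywords})
        occurrences.update(unique)                 # node size in the map
        links.update(combinations(unique, 2))      # link strength in the map
    return occurrences, links


# Toy input (hypothetical), one list per paper.
papers = [
    ["Big data", "Machine learning", "Safety"],
    ["Big data", "Deep learning", "Construction workers"],
    ["Big data", "Machine learning", "Activity recognition"],
]

occurrences, links = cooccurrence(papers)

if __name__ == "__main__":
    print(occurrences.most_common(3))
    print(links.most_common(3))
```

In this representation the occurrence count corresponds to the node size in the map and the pair count to the link strength; pairs are stored in alphabetical order so that each unordered pair has a single key.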

Literature Publication Source Analysis
The publication-year statistics of the reviewed literature (Figure 4) show that, from 2011 to the end of 2021, the amount of literature related to big data increased and the application of big data in engineering became more and more important. The journal Building Automation published the most literature matching the theme of this paper, and more attention can be paid to this journal in the future to obtain more relevant content.


Overview of Big Data Technologies
In recent years, the popularity of Internet information technology has led to an explosion in the amount of data available in various engineering fields [11]. At the same time, expanding storage capacity and continuous advances in computing technology have enabled the information value of big data to be captured in a timely and effective manner [12]. The nature of big data is ambiguous, and extensive processing is required to identify the data and transform them into new insights [13]. Therefore, large structured and unstructured datasets need to be mined with a number of non-traditional data filtering tools to provide useful insights [14]. Hence, Big Data Management, which analyzes and extracts value from the data collected, has emerged. This section reviews and analyzes the current state of application of big data technology in construction safety management based on the screened literature, in terms of both big data collection and big data analysis.

Big Data Collection
With the development of computer technology and a range of other advanced technologies, there are more and more means of collecting big data. For example, Han et al. [15] developed a human motion capture framework based on visual recognition technology that can extract 3D human skeletal motion models from live videos and use the motion data to identify unsafe movements of workers. Yu et al. [16] proposed a parametric approach based on an image skeleton to identify unsafe behavior of construction workers in real time; the approach identifies the relevant leading gestures and their parameters, determines a range of standard values for the parameters, and on this basis provides a method for the early recognition of unsafe behavior. Alwasel et al. [17] utilized inertial measurement units (IMUs) and cameras to collect kinematic data from masonry workers and identify posture clusters from them. Yu et al. [18] made a detailed classification of data collection methods related to construction workers' work posture and compared the performance of each method to show the advantages of motion sensors and RGB image-based worker posture estimation for safety management.
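The parametric idea described above for skeleton-based recognition (extract a posture parameter, compare it with a range of standard values) can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of [16]: the choice of a single joint angle as the parameter, the function names, the keypoint coordinates and the threshold range `TRUNK_SAFE_RANGE` are all hypothetical.

```python
import math


def joint_angle(a, b, c):
    """Return the angle at joint b (in degrees) formed by the 2D points a-b-c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    norm = math.hypot(*v1) * math.hypot(*v2)
    if norm == 0:
        raise ValueError("keypoints must not coincide with the joint")
    cos_angle = (v1[0] * v2[0] + v1[1] * v2[1]) / norm
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return math.degrees(math.acos(cos_angle))


def is_unsafe(angle, safe_range):
    """Flag a posture parameter that falls outside its standard range."""
    low, high = safe_range
    return not (low <= angle <= high)


# Hypothetical standard range for the shoulder-hip-knee angle, in degrees.
TRUNK_SAFE_RANGE = (150.0, 180.0)

if __name__ == "__main__":
    # Hypothetical keypoints (x, y) of a worker bending forward at the hip.
    shoulder, hip, knee = (1.0, 0.0), (0.0, 0.0), (0.0, -1.0)
    angle = joint_angle(shoulder, hip, knee)
    print(angle, is_unsafe(angle, TRUNK_SAFE_RANGE))
```

In practice the keypoints would come from a pose estimation step applied to each video frame, and several parameters would be checked together over time rather than one angle in a single frame.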

A new management approach in big data collection and management is behavior-based safety (BBS), which is used to observe, analyze and modify workers' behavior in construction [19]. Guo et al. [9] developed a big-data-based employee behavior observation platform that combines traditional behavioral observation with advanced technology to collect and identify unsafe behavior patterns and thereby improve safety on construction sites. The specific implementation is as follows: first, a knowledge base of behavioral risks and a list of unsafe behaviors to focus on at construction sites are established; next, construction behavior data are collected through video surveillance of workers' behavior during construction and through on-site photos; finally, the unsafe behavior images extracted from the surveillance video and the on-site photos are uploaded to a big data cloud platform, classified, and stored through a distributed file manager. This approach to data collection saves time and effort, does not require human observers to watch a large number of samples, and does not require the cooperation of workers.
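The collection workflow just described (knowledge base, observation, classification, storage) can be sketched in a few lines. This is a simplified illustration and not the platform of [9]: the knowledge base entries, the `Observation` record and the function `classify_and_store` are hypothetical, and an in-memory dictionary stands in for the cloud platform and distributed file storage.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical knowledge base: unsafe behavior -> risk category.
KNOWLEDGE_BASE = {
    "no helmet": "personal protective equipment",
    "no harness at height": "fall from height",
    "standing under lifted load": "struck by object",
}


@dataclass(frozen=True)
class Observation:
    source: str     # "video" or "photo"
    behavior: str   # behavior label assigned to the image
    image_ref: str  # reference to the stored image file


def classify_and_store(observations, knowledge_base):
    """Keep observations that match the unsafe-behavior list, grouped by risk category."""
    store = defaultdict(list)
    for obs in observations:
        category = knowledge_base.get(obs.behavior)
        if category is not None:  # behaviors outside the list are not stored
            store[category].append(obs)
    return dict(store)


if __name__ == "__main__":
    observed = [
        Observation("video", "no helmet", "frame_001.jpg"),
        Observation("photo", "tidy walkway", "img_002.jpg"),
        Observation("photo", "no helmet", "img_003.jpg"),
    ]
    for category, items in classify_and_store(observed, KNOWLEDGE_BASE).items():
        print(category, len(items))
```

Separating the knowledge base from the classification step means the list of behaviors to focus on can be revised without changing how observations are collected or stored.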

Big Data Analysis
Big data comes from different data sources that have different forms of data and are processed by different organizational entities, thus forming a big data chain [20]. In addition to the sheer volume of data, big data has characteristics that traditional data do not have. Big data includes structured, semi-structured and unstructured data; unstructured data are increasingly becoming the major part of the data and require more real-time analysis [21,22]. As a result, Lee [23] pointed out that the acquisition, storage, management and analysis of large collections of data that are significantly beyond the capabilities of traditional database software tools cannot simply be performed using traditional data management and analysis techniques, but require special techniques to handle large volumes of data effectively within a tolerable elapsed time. Additionally, Gandomi and Haider [12] found that discussions of big data have focused on structured data, ignoring the fact that big data often exists in the form of audio, images, video and unstructured text. With unstructured data accounting for up to 95% of big data, there is a need to develop sound and effective analytics to capture the value of the vast amount of heterogeneous data in unstructured text, audio and video formats, which should also include the development of various classification and prediction systems to examine trends and patterns and then interpret the results [24]. Manyika et al. [25] listed 26 common data analysis methods in their 2011 report. The five most frequently used methods in the current research literature are simulation, predictive modelling, optimization, statistics and regression, with around a third of papers using these methods [11]. However, only about ten big data analytics techniques are applicable to the engineering and construction safety field.
Gandomi and Haider [12] divided these techniques into five categories by type of data structure format: text analytics, audio analytics, video analytics, social media analytics and predictive analytics. Because construction safety management rarely involves social media data, this paper focuses on the big data analysis techniques used in the field of construction safety and divides them into four categories, namely text analysis, audio analysis, video analysis and predictive analysis. The specific big data techniques are shown in Table 3.

Text Analysis
Text analysis (text mining) is the technique of extracting information from textual data, transforming unstructured initial text into computer-recognizable, structured information [12].
Records of construction accidents are dominated by textual data, and the analysis of construction accident data is one of the most effective ways to improve safety on construction sites. Statistical analysis, as a common text analysis technique, has been used by many researchers to analyze accidents. For example, Chong and Low [26] analyzed the causes and behaviors behind construction safety problems in Malaysia by studying past statistics and court cases and, on the basis of these statistics, put forward effective safety measures and remedial measures to prevent and reduce the recurrence of construction injuries. Using a large amount of previously collected accident data, Xu and Xu [27] used cluster analysis to rank the severity of fatal engineering construction accidents and predicted the potential fatalities caused by engineering construction accidents in 2020 based on the GM(1,1) model; the results can be used as a basis for preventing engineering construction accidents in practice. Yang et al. [28] conducted a big-data-based statistical analysis of 14,578 construction accidents in Korea, including specific data on accident causes, worker behavior, injury areas and injury factors, which provides a reference for improving the accident information service content of the construction project management system (CPMS).
Worker safety awareness is a major concern on construction sites: hazardous working conditions are attributed to the dynamic and complex nature of construction sites, which makes the analysis of site accident data particularly valuable. Park et al. [29] and Shin et al. [30] extracted knowledge about the relationships between multiple attributes of construction accidents, expressed in the form of association rules, using 98,189 mass casualty accidents that occurred on construction sites in Korea from 2006 to 2010. Kim and Kim [31] analyzed in detail the causes of construction site fire accidents that are not provided in the statistics and applied principal component analysis (PCA) to infer season-specific factors of construction site fire accidents, providing a solution for the prevention of such workplace accidents. Cheng et al. [32] developed a hybrid model combining a gated recurrent unit (GRU) and symbiotic organisms search (SOS), named the Symbiotic Gated Recurrent Unit (SGRU), to assist in the safety assessment of construction projects, using natural language processing techniques to pre-process text data for a priori classification. Li et al. [14] used statistical analysis to characterize accidents as a cornerstone for future construction accident prevention. In summary, text analysis focuses on the collection and analysis of past accidents and on avoiding construction safety accidents in advance from a preventive perspective.
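To make the GM(1,1) grey prediction model mentioned above more concrete, the following sketch implements its textbook form: accumulate the series, fit the development coefficient a and grey input b by least squares on the background values, and difference the fitted accumulated series to obtain forecasts. It is not necessarily the exact procedure used in [27]; the function names, the minimum series length and the sample series are assumptions for illustration only and are not accident data from the reviewed papers.

```python
import math
from itertools import accumulate


def gm11_fit(series):
    """Estimate the GM(1,1) parameters (a, b) from a positive time series."""
    if len(series) < 4:
        raise ValueError("assumed minimum of four observations")
    x1 = list(accumulate(series))                        # accumulated series
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, len(x1))]  # background values
    y = series[1:]
    m = len(z)
    sz, sy = sum(z), sum(y)
    szz = sum(v * v for v in z)
    szy = sum(zi * yi for zi, yi in zip(z, y))
    # Least squares for y = slope * z + intercept, where slope = -a and intercept = b.
    slope = (m * szy - sz * sy) / (m * szz - sz * sz)
    intercept = (sy - slope * sz) / m
    return -slope, intercept


def gm11_forecast(series, steps=1):
    """Forecast the next `steps` values of the original (non-accumulated) series."""
    a, b = gm11_fit(series)
    if abs(a) < 1e-12:
        return [b] * steps                               # degenerate case: no trend
    n = len(series)

    def x1_hat(k):
        # Fitted accumulated value at 0-based index k.
        return (series[0] - b / a) * math.exp(-a * k) + b / a

    return [x1_hat(k) - x1_hat(k - 1) for k in range(n, n + steps)]


if __name__ == "__main__":
    sample = [100.0, 110.0, 121.0, 133.1]                # hypothetical yearly counts
    print(gm11_fit(sample))
    print(gm11_forecast(sample, steps=2))
```

A negative value of a indicates a growing series and a positive value a declining one. Because the model assumes a roughly exponential trend, the fit should be checked against held-out years before the forecasts are relied upon.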

Audio Analysis
Audio analytics is the technique of analyzing and extracting information from unstructured audio data, transforming unstructured speech information into a structured index [12].
In recent years, an increasing number of techniques have been developed to analyze audio data generated on the construction sites of engineering projects. Lee et al. [33] investigated an accident detection system based on audio data, which identifies construction accidents through audio information and provides real-time safety information to workers. Rashid and Louis [34] proposed a method using audio signals and machine learning to identify manual construction activities in a modular construction plant. Scarpiniti et al. [35] proposed a Deep Belief Network (DBN)-based audio signal classification method to identify and detect construction safety issues on site through remote monitoring of work activities on engineering construction projects. Scarpiniti et al. [36] also proposed a deep recurrent neural network (DRNN) method based on LSTM units to collect and classify audio data recorded at engineering construction sites for hazardous behavior detection and activity range monitoring; this network can be used on construction sites where a rapid response is required. During construction, the use of construction machinery can cause serious damage to underground pipeline networks, affecting people's lives as well as construction safety. Therefore, Wang et al. [37] proposed a new CMC system based on new hybrid acoustic characteristics, together with two new acoustic feature extraction methods for the collected audio data, for the identification and monitoring of construction equipment. The analysis of engineering construction operations based on audio data relies primarily on advanced hardware facilities and targeted software techniques to achieve satisfactory functionality [38].
Vulnerable construction environments require advanced safety monitoring and event detection methods. To provide a complementary approach to safety monitoring, Xie et al. [39] built an autonomous audio-based safety monitoring system that uses machine learning techniques to accurately classify sound types, based on sound training data tailored to project progress and safety data, and to provide early warnings when irregularities are detected. Park, Cho and Khodabandelu [29] used a sensor-based tracking system to collect and use location data from individuals. A procedural model was also developed to quantify the potential risk to workers by analyzing and calculating the safety of individual workers. As each operational activity on an engineering construction site generates its own characteristic sound, its sound signature provides important information about the nature of the operation, the process, the space in which the operation takes place and safety-related issues. Therefore, Zhang et al. [40] developed a sound recognition technique for engineering construction activity identification and task performance analysis, extracting Mel-frequency cepstral coefficients (MFCCs) that represent the characteristics of six types of construction sound data and classifying the sound data using a Hidden Markov Model (HMM) machine learning algorithm; the technique promises reliable construction sound recognition and is of great use for construction monitoring, performance assessment and safety monitoring. In view of this, audio analysis can be used to identify safety problems on construction sites by analyzing sound (both human voices and any other sound collected on site) to provide timely or early warning; it can also be used as a reliable means of analysis after a safety incident has occurred.
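The HMM-based sound classification described above can be illustrated with a small sketch. It assumes that each audio frame has already been reduced to a discrete symbol (for example, by quantizing its MFCC vector), trains nothing, and simply scores a symbol sequence against one hand-specified HMM per activity using the scaled forward algorithm. The activity names, symbol alphabet and all probabilities are hypothetical and are not taken from [40].

```python
import math


def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs | HMM) for discrete observation symbols."""
    n_states = len(start)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n_states)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate state probabilities one step, then weight by the emission
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n_states)) * emit[j][obs[t]]
                for j in range(n_states)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        log_prob += math.log(scale)
        alpha = [value / scale for value in alpha]  # rescale to avoid underflow
    return log_prob


def classify(obs, models):
    """Return the name of the model under which the sequence is most likely."""
    return max(models, key=lambda name: log_likelihood(obs, *models[name]))


# Hypothetical two-state models over three quantized sound symbols (0, 1, 2):
# each entry is (start probabilities, transition matrix, emission matrix)
MODELS = {
    "hammering": (
        [0.6, 0.4],
        [[0.7, 0.3], [0.4, 0.6]],
        [[0.8, 0.15, 0.05], [0.6, 0.3, 0.1]],
    ),
    "drilling": (
        [0.5, 0.5],
        [[0.6, 0.4], [0.3, 0.7]],
        [[0.05, 0.15, 0.8], [0.1, 0.3, 0.6]],
    ),
}

label = classify([0, 0, 1, 0, 0], MODELS)
```

In a real system the model parameters would be estimated from labelled recordings rather than written by hand, and continuous emission densities over the MFCC vectors could replace the discrete symbols assumed here.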

Video Analysis
Video analytics, also known as video content analysis (VCA), comprises a variety of techniques for separating the target from the background in a video stream in order to track, monitor and extract meaningful information [12].
In recent years, an increasing number of intelligent methods have been used for construction management. With the continuous development of image recognition technology driven by big data, the construction sites of engineering projects are becoming more comprehensively intelligent [10]. Han et al. [41] proposed a vision-based unsafe behavior detection framework for behavior monitoring and conducted an experimental study on motion datasets extracted from videos to detect predefined unsafe behaviors. Han et al. [42] also developed a computer vision-based motion capture technique for tracking 3D skeleton motion models extracted from video and proposed a motion classification technique for the automatic detection of worker movements. Subsequently, Han, Lee and Peña-Mora [15] transformed the motion data collected from construction workers during ladder climbing into the same space as the 3D human skeleton model extracted from the video and the a priori model for motion detection. Afterwards, Han and Lee [43] captured motion data using Kinect depth sensors to monitor and automatically analyze construction workers' behavior. Using new big-data-driven technologies, Su, Mao, Jiang, Liu and Wang [10] transformed the traditional engineering construction site management problem into a computer graphics processing problem, introducing a system based on CNN image recognition technology, which greatly improves the efficiency of engineering construction safety management through the analysis of construction site processes. Tang and Golparvar-Fard [44] proposed a method for predicting the risk severity level for individual workers from collected site images and video clips.
Worker activity recognition is further improved by a spatio-temporal graph neural network model that uses the identification of worker activity per frame, the detection of tool and material bounding boxes and the estimation of worker posture, with great potential for real-time worker safety detection and severity assessment. Video surveillance systems provide a large amount of unstructured on-site image data for worker safety equipment inspections, but automated computer vision-based solutions are needed for real-time detection. Therefore, Li et al. [45] developed a video-data-driven real-time helmet detection method for construction sites based on deep learning techniques. The proposed deep learning-based model, which uses the SSD-MobileNet algorithm, is able to detect failures to wear helmets on construction sites with good accuracy and efficiency. Tang et al. [46] proposed an activity prediction framework for construction safety management in which a long short-term memory (LSTM) encoder-decoder network predicts the future locations and movements of workers and equipment from the movements observed in previous video data, and a mixture density network (MDN) models the uncertainty in the prediction; this has a significant effect on the control of unsafe worker behavior at construction sites. As can be seen, video analytics allows construction site conditions to be analyzed more flexibly and is an especially important means of analysis among big data analytics techniques.
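As an illustration of how the output of a helmet detector might be turned into a per-worker compliance check, the sketch below matches detected helmet boxes to detected worker boxes. It is an assumed post-processing rule, not the method of [45]: a worker is counted as wearing a helmet when the center of some helmet box falls inside the top portion of the worker's bounding box. The box format (x_min, y_min, x_max, y_max) in image coordinates, the 25% head fraction and all function names are hypothetical.

```python
def head_region(worker_box, head_fraction=0.25):
    """Top slice of a worker box; image y grows downward, so the head is near y_min."""
    x_min, y_min, x_max, y_max = worker_box
    return (x_min, y_min, x_max, y_min + head_fraction * (y_max - y_min))


def box_center(box):
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2.0, (y_min + y_max) / 2.0)


def contains(region, point):
    x_min, y_min, x_max, y_max = region
    x, y = point
    return x_min <= x <= x_max and y_min <= y <= y_max


def workers_without_helmet(worker_boxes, helmet_boxes, head_fraction=0.25):
    """Return indices of workers whose head region contains no helmet center."""
    missing = []
    for index, worker in enumerate(worker_boxes):
        region = head_region(worker, head_fraction)
        if not any(contains(region, box_center(helmet)) for helmet in helmet_boxes):
            missing.append(index)
    return missing


# Hypothetical detections for a single video frame
workers = [(100, 50, 160, 250), (300, 60, 360, 260)]
helmets = [(115, 45, 145, 75)]
flagged = workers_without_helmet(workers, helmets)
```

A rule of this kind is sensitive to occlusion and unusual postures, so in practice it would need to be tuned and validated on site footage before being used to raise alarms.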

Predictive Analysis
Predictive analytics refers to algorithms and techniques that use historical and current data, both structured and unstructured, to predict future outcomes [12]. Predictive modelling refers to a set of rules expressed as mathematical formulas, so that the observed quantitative relationships and trends between things reveal, to some extent, their inherent regularities. Han and Wang [47] argued that prediction is at the heart of big data. Predictive analytics techniques can be divided into two categories: techniques such as moving averages, which extrapolate future developments from the history of the outcome variable, and techniques such as linear regression, which predict from the interrelationship between outcome and explanatory variables.
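The two categories can be contrasted with a short sketch: a moving average that forecasts the next value from the outcome variable's own history, and a simple least-squares linear regression that predicts the outcome from one explanatory variable. The function names and the example numbers are hypothetical and serve only to illustrate the distinction.

```python
def moving_average_forecast(history, window=3):
    """Forecast the next value as the mean of the last `window` observations."""
    if window <= 0 or window > len(history):
        raise ValueError("window must be between 1 and len(history)")
    return sum(history[-window:]) / window


def fit_linear_regression(x, y):
    """Ordinary least squares for y = slope * x + intercept with one explanatory variable."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical monthly incident counts: extrapolate from the outcome's own history
monthly_incidents = [4, 6, 5, 7, 9]
next_month = moving_average_forecast(monthly_incidents, window=3)

# Hypothetical explanatory variable (weekly overtime hours) against incident counts
overtime_hours = [1, 2, 3, 4]
incidents = [3, 5, 7, 9]
slope, intercept = fit_linear_regression(overtime_hours, incidents)
predicted = slope * 5 + intercept  # prediction for a new value of the explanatory variable
```

The moving average needs no explanatory data but cannot respond to changes in site conditions, whereas the regression can, provided the explanatory variable is measured reliably.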
As smart construction grows and more and more data are collected, there is a new trend towards big-data-driven and machine learning approaches to support construction safety risk assessment [48]. Many examples of successful applications of machine learning techniques in the field of construction safety can be found in the literature [2]. One such study used machine learning techniques to analyze 16 key factors that contribute to safety incidents, employing eight algorithms (Logistic Regression, Decision Trees, Support Vector Machines, Naive Bayes, k-Nearest Neighbor, Random Forest, Multilayer Perceptron and AutoML) to make predictions and to assess the effectiveness of combinations of unsafe factors in predicting the severity of construction incidents. Tixier et al. [49] applied two advanced machine learning models, Random Forest (RF) and Stochastic Gradient Tree Boosting (SGTB), to provide reliable probabilistic predictions of possible accidents. Zhu et al. [50] applied, validated and compared the effectiveness of different machine learning methods in classifying the severity of construction safety accidents in China; analysis of the results from the validated prediction models can reveal some unique safety risk patterns. Dong et al. [51] introduced a deep autoregressive (DeepAR) model, a machine learning model for probabilistic time-series forecasting, to predict slope displacements; it provides good safety control during construction and reduces the number of slope instability incidents at building construction sites. Choi, Gu, Chin and Lee [2] compared the effectiveness of machine learning methods for prediction in construction safety management, using publicly available national data to develop a predictive model for fatal accidents on construction sites, which can reduce the likelihood of fatal accidents and support more proactive management of construction safety. Abbasianjahromi et al. [52] proposed a framework to develop an innovative machine learning tool, Linear Artificial Bee Colony Programming (LABCP), that generates predictive models to automatically identify the relationship between safety criteria and project safety performance from the collected data. By applying the results of LABCP in practice, the developed model can predict future project safety by measuring the relevant performance criteria.
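To show how one of the classifiers listed above operates on accident records, the following is a minimal k-Nearest Neighbor sketch for severity classification. It is not the pipeline of any of the cited studies; the two numeric features, the labels and the training records are hypothetical, and the features are assumed to be on comparable scales already.

```python
import math
from collections import Counter


def knn_predict(train_features, train_labels, query, k=3):
    """Label a query record by majority vote among its k nearest training records."""
    if k <= 0 or k > len(train_features):
        raise ValueError("k must be between 1 and the number of training records")
    # Euclidean distance from the query to every training record
    distances = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical records: (scaled working height, scaled equipment load) -> severity label
features = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
labels = ["minor", "minor", "minor", "severe", "severe", "severe"]

severity = knn_predict(features, labels, query=(2, 2), k=3)
```

Comparing several such classifiers on the same held-out records, as the cited studies do, is what allows the most effective method for a given accident dataset to be identified.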
Artificial neural network models simulate certain intelligent activities of the human brain through learning algorithms in order to solve real-world problems, and they are widely used in big data analysis. Artificial neural networks are currently the most widely used machine learning method in engineering construction risk assessment [48]. For example, Lee and Han [53] proposed a convolutional neural network (CNN) to learn unsafe action patterns by simulating workers' unsafe behavior, which can prevent accidents related to falls from height by continuously monitoring workers at height and providing continuous feedback. Su, Mao, Jiang, Liu and Wang [10] also proposed a data-driven approach based on CNN image recognition techniques to develop an automatic fire detection and alarm system for engineering construction sites. The method is applicable to a variety of construction environments and can recognize fires in real time while maintaining a high level of accuracy. Ding et al. [54] developed a new hybrid deep learning model integrating a Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM), which automatically identifies unsafe worker behavior. The model is used to (1) identify unsafe worker behavior; (2) capture live video and extract motion data; (3) extract visual features using a CNN model; and (4) sequence the learned features using the LSTM model. Ayhan and Tokdemir [8] developed a new model for predicting construction accidents using latent class clustering analysis and artificial neural networks (ANN) and proposed the necessary preventive measures; using real data collected anonymously from various construction sites, the ANN is used to perform the severity analysis of the incidents.

Discussion
This study summarizes big data analytics techniques in the field of construction safety applied to different types of data, but whether there are other types of big data analytics techniques (e.g., inference and induction-based paradigms, abstraction-based paradigms, simplification-based paradigms) that can be applied to construction safety problems is subject to further research. Existing research on classification and identification suggests that innovative applications of big data solutions are not directly related to the construction process, but rather complement and improve the processes of major construction projects [55]. Therefore, many questions remain such as: (1) How can big data be applied to construction worker behavioral safety and what are the challenges? (2) How can big data technologies be integrated into construction safety? (3) How can big data technology be integrated with construction management systems? We will discuss these questions in this study and answer them in the conclusion.

Challenges of Applying Big Data to Worker Behavioral Safety
In construction, unsafe worker behavior (which includes not wearing protective equipment such as helmets and safety belts, being struck by vehicles and construction equipment, maintaining unhealthy postures, etc.) is one of the main causes of workplace accidents and injuries, with approximately 80-90% of accidents being related to unsafe worker behavior [43]. Therefore, workers' behavioral safety needs to be given the utmost attention to reduce the incidence of safety accidents [9]. However, controlling this behavior is not an easy task, as unsafe behavior exhibits various characteristics and is influenced by numerous factors. Therefore, the development of big data technologies for monitoring construction sites to limit unsafe behavior remains an unfinished challenge [56].
In the past, observing workers on the job site was difficult, but the advent of big data technology has provided a reliable and automated means of doing so. In terms of data collection, Yu, Umer, Yang and Antwi-Afari [18] revealed the widespread use of motion sensors and RGB (red, green and blue) cameras for acquiring worker posture-related data. Alwasel, Sabet, Nahangi, Haas and Abdel-Rahman [17] used an inertial measurement unit (IMU) and a camera to collect kinematic data from masons. In terms of data recognition, Han and Lee [43] proposed a vision-based unsafe motion detection framework for behavioral monitoring, in which a 3D human skeletal motion model extracted from live video and motion data are used to identify unsafe movements of workers. Xie et al. [57] proposed a convolutional neural network-based helmet detection algorithm, in which the model is trained to identify the exact helmet wearing pattern and computer vision techniques are used to check the helmet wearing rate. Sanhudo et al. [58] proposed the classification of complex construction worker activities using accelerometers and machine learning to guide workers toward performing safer activities. Lee and Han [53] proposed a convolutional neural network (CNN) to learn unsafe worker movement patterns, a method that can prevent fall-related accidents by continuously monitoring workers at height and providing real-time feedback. Huang et al. [59] proposed a deep learning-based helmet wearing detection algorithm that optimizes the prior dimension algorithm for a specific helmet dataset and improves the loss function; combined with pixel feature statistics from image processing, it accurately detects whether helmet wearing meets the required criteria, allowing construction site safety procedures to be monitored. However, defining rules and templates for all unsafe behaviors is a challenge.
As a result, others have developed artificial-intelligence-based big data systems for construction worker safety management to achieve better safety management with less human input and lower resource costs [60]. In summary, the control of unsafe human behavior continues to evolve and remains the primary challenge in the field of construction safety management.

Prospects and Challenges for the Integration of Big Data Technologies
With the progress of science and technology, information technology provides new ideas for the development of the construction industry and for solving various problems in engineering and construction, and using information technology to support the development of the construction industry is bound to be the new direction [61]. Big data, as the lifeblood of information technology, has an extremely important role to play in this development. By analyzing large datasets collected from different projects in different places and under different conditions, we can extract the main factors that jeopardize sustainability, so that they can be dealt with appropriately in subsequent projects elsewhere [62]. Fragmented big data technologies have had a positive impact on all aspects of construction safety on engineering projects, but integrating them into a more comprehensive system remains a challenge for the future.
Many scholars and companies have studied the integration of big data technologies applied to construction sites. For example, Liu, Hou, Xiong, Nyberg and Li [61] built a cloud platform for intelligent construction sites, a technical integration system based on cloud computing that combines mobile data collection, data mining and intelligent feedback pushed to workers by managers, and which can provide better supervision services and convenience for managers. Zhou et al. [63] established a metro construction accident database (SCAD) to collect, validate and store 548 accident records drawn from six potential data sources. Based on these accident cases and the SCAD coding scheme, in-depth analysis was conducted to identify accident trends and patterns in multiple dimensions, such as severity, consequences and causal factors, in order to prevent future metro construction accidents.
Big data is characterized by multi-source heterogeneity, time constraints, spatial and temporal correlation, concurrency and synchronicity. Existing data integration and calculation theories and methods cannot meet the requirements of an efficient big data collection process in construction projects, nor the requirements of construction site domain knowledge, where information is "big and sparse". New data processing theories and methods therefore need to be established for multi-scale convergence calculation and dynamic maps of big data, and the need for new types of integration and collaboration among big data technologies remains an unfinished challenge.

Combining Big Data Technology with Construction Management Systems
Building construction involves many stages and complex management content. All construction carries certain safety risks, so modern technology needs to be used to raise the level of management, improve the traditional management model and make up for the shortcomings of management work. At present, big data technology plays an important role in engineering construction, and applying it to construction site management to achieve smart construction sites can provide more comprehensive protection for engineering construction safety.
In pre-construction management, Li, Wei, Han, Huang and Wang [45] used big data video analysis in conjunction with system management to check whether workers were wearing safety equipment. It is also possible to measure and calculate the cognitive state of construction workers based on EEG [22] and to use intelligent design techniques to study the application of new technologies in construction from a "human-centered" perspective [64]. In construction process management, audio analysis, video analysis and predictive analysis can be combined with system management to monitor the unsafe behavior of workers during construction [46]. Such a system can be used to monitor unsafe worker behavior [45] and to issue warnings about unsafe environmental conditions [65]. Choi, Gu, Chin and Lee [2] designed a construction management system to monitor workers' unsafe behavior during construction and to provide early warning of unsafe equipment behavior (early warning of environmental safety and better prevention of construction accidents through predictive analysis). When safety accidents occur, end-of-construction management can collect and analyze accident data through big data text analysis [32]. Public health safety in engineering construction should not be ignored either [66]; all of these aspects can be monitored to some extent in the future by applying big data analysis technology, which will provide data support for the comprehensive development of engineering construction safety systems.

Conclusions
The construction industry is an important material production sector and one of the pillars of China's national economy, and it is also a high-risk industry with a high incidence of safety accidents. Construction safety accidents cause death, injury and significant direct and indirect losses to construction workers. Therefore, an in-depth understanding of safety management on construction sites is of great importance. Additionally, with the development of computer technology and a range of advanced technologies in the engineering field, big data management techniques have emerged to analyze the data collected and extract its value. This paper systematically reviews 66 articles that are closely related to the research topic and objectives, and summarizes and analyzes the current situation of the application of big data technology in construction safety management from two aspects, big data collection and big data analysis, providing valuable insights for improving safety at engineering construction sites and guiding future research in this field.
Specifically, this review compares the current situation of big data applied to various construction safety issues from the perspectives of big data collection and big data analysis in engineering construction projects. It focuses on big data analysis technologies in the field of construction safety, divides them into four categories (text analysis, audio analysis, video analysis and predictive analysis) and lists the breakthrough achievements of big data analysis technologies in improving construction safety. Finally, the trends and challenges of big data in the field of construction safety are discussed in three directions: the application of big data to worker behavior, the prospect of integrating big data technologies and the integration of big data technologies with construction management.
From our research, we conclude that while big data analytics has long been used in construction safety, the more flexible and powerful integration of big data technologies with construction safety has been relatively slow. These technologies have broad applicability in many areas, and the big data trend is gradually spreading throughout the engineering and construction industry, with emerging trends such as the Internet of Things, cloud computing and smart buildings further amplifying its applicability; the integration of big data technologies with construction management systems is therefore a new trend in construction safety. To our knowledge, this is the first in-depth look at the application of big-data-related technologies to construction safety. In our work, we have identified a number of potential application areas and technologies that can be combined and applied to construction safety, showing that big data technologies can improve the state of the art in construction safety in a number of ways. This work is practical and relevant for all construction researchers and practitioners who wish to exploit big data in the field of construction safety and construction.
Despite the extensive review, this study has the following limitations. Firstly, there are many more big data techniques that can be applied to construction safety, and there may be other data collection tools with better performance that could not be included in this review. Secondly, more than one factor affects construction safety, both on site and off site, and force majeure factors such as earthquakes were not considered in the study.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author.