Big Data and Human Resources Management: The Rise of Talent Analytics

Abstract: The purpose of this paper is to discuss the opportunities talent analytics offers HR practitioners. As the availability of methodologies for the analysis of large volumes of data has substantially improved over the last ten years, talent analytics has started to be used by organizations to manage their workforce. This paper discusses the benefits and costs associated with the use of talent analytics within an organization and highlights the differences between talent analytics and other sub-fields of business analytics. It discusses a number of case studies on how talent analytics can improve organizational decision-making. From the case studies, we identify key channels through which the adoption of talent analytics can improve the performance of the HR function and eventually of the whole organization. While discussing the opportunities that talent analytics offers organizations, this paper highlights the costs (in terms of data governance and ethics) that the widespread use of talent analytics can generate. Finally, it highlights the importance of trust in supporting the successful implementation of talent analytics projects.


Introduction
Over the last decade or so, the exploitation of big data (i.e., large volumes of structured and unstructured data generated by the routine activities of organizations) has become very popular among organizations (Mayer-Schönberger and Cukier 2013). The reasons for this interest are well rehearsed: the cost of storing data (in any format) has fallen drastically, and at the same time, technology for the production of data (such as sensors and wearables) has become cheap [1]. Simultaneously, the techniques that allow one to manipulate and process data stored by organizations are now embedded within standard software, with the result that practitioners can quickly extract insights from their data and use them to improve organizational performance (McAfee and Brynjolfsson 2012; Brands 2014).
Following the practitioners' interests, academic researchers have started to reflect on the use of big data within organizations and its impact on performance (Davenport et al. 2010; Rifkin 2014). Research in this field has developed along two main lines. First, it has tried to identify ways organizations have used their big data holdings to improve their performance (Davenport et al. 2010). As a result, we now have evidence that organizations tend to adopt new "data-driven" strategic decision-making models across different business functions (Kiron 2013; Beath et al. 2012) because of their enhanced capability to exploit their big data (e.g., Akter et al. 2016; Erevelles et al. 2016; Wamba et al. 2017). Second, research has quantified the impact on business performance of the data-driven decision-making models: an often-quoted study by McAfee and Brynjolfsson (2012) found that businesses that use big data to inform their planning and decision-making functions perform better than those which do not. More specifically, these businesses were on average 5% more productive than their competitors in the same industry.
[1] The fall in storage costs has been driven by the introduction (and diffusion) of cloud-based systems.
How do big data (and the associated analytical techniques) support the Human Resources (HR) function? Talent analytics [2] (as a separate sub-field of business analytics) has emerged as a suite of methodologies that allow one to identify patterns in workforce data in order to manage the workforce, drive change (Guenole et al. 2017), and eventually create value (Marler and Boudreau 2017; Boudreau and Jesuthasan 2011; Pease et al. 2012; Boudreau and Ramstad 2007). Talent analytics can help answer key questions (Guenole et al. 2017; Fink and Sturman 2017), such as the following: What is the relationship between training and productivity? How does an organization retain employees? Does an organization's well-being program contribute to performance? Are employees with specific degrees more productive than others? Are permanent employees a better investment than temporary employees?
The benefits of talent analytics in terms of value creation are clear. For instance, if an organization can identify a causal relationship between training expenditure and profitability, then it is possible for the organization to set up a training strategy that may have a quantifiable impact on profitability.
In spite of its potential benefits, the emergence of talent analytics as a separate field of business analytics has been very slow (CIPD 2013; OrgVue 2019). In the 2014 Global Human Capital Trends survey commissioned by Deloitte, surveyed businesses indicated that they understood the importance of building their talent analytics capabilities, but also revealed significant gaps in their current readiness. Equally, in a survey conducted by Falletta (2014), only 15% of respondents claimed that talent analytics played a key role in their organization. Similar results were provided by Lawler et al. (2015), although other reports highlighted that in some industries the use of talent analytics was very common due to the availability of data [3].
However, all this has changed quickly: for instance, the 2016 report from IBM (2016) on talent analytics found that the use of predictive analytics had increased by 40% over the previous two years. Interest in talent analytics has been driven by a number of factors (CRF Research 2017; OrgVue 2019): a) the wide use of analytics in marketing, finance, and other business functions has led to an increasing awareness that the exploitation of data can help to create value (Davenport et al. 2010); b) the availability of cheap systems has made data collection, storage, and processing straightforward as well as visually appealing (OrgVue 2019); c) the increasing use of metrics among HR teams has led organizations to invest resources in the development of quantitative skills among HR professionals. Additionally, a few organizations highlight that the use of analytics may help them to manage uncertainty (OrgVue 2019), as it can help identify the sources of risk, design mitigation strategies (CRF Research 2017), and eventually align HRM practices to performance (CIPD 2013).
At the same time, there are a number of critical areas related to the use of talent analytics that need to be addressed before organizations can take full advantage of the opportunities it offers (Kamp 2017):
a) The relationship between business performance and talent analytics. Although many claims have been made about the extent to which talent analytics can contribute to business performance, we still lack a theoretical framework that allows one to identify the channels through which the deployment of talent analytics may have an impact on organizational performance. Theoretically, the link between business performance and talent analytics is "fuzzy" because of the difficulties of mapping individual behaviour to aggregate outcomes such as productivity and performance. For instance, some organizations may be interested in predicting business turnover; at the same time, linking business turnover to variables such as employees' engagement or staff turnover may be problematic unless there is a clear theoretical model that links individual performance to organizational outcomes (Levenson 2015; Levenson and Fink 2017). As has been highlighted by some authors, without a theoretical model driving the implementation of talent analytics projects, there is a risk that value can be destroyed (Angrave et al. 2016; Henke et al. 2016).
b) Quality of data. Talent analytics relies on techniques that are borrowed from domains or business functions (such as marketing and finance) that tend to produce very large volumes of data. However, small organizations may not have high-quality HR data and may lack the analytical capabilities to adapt techniques designed for big data to areas where the volume of data is quite small (Cappelli 2017). The result is that translating the findings into business outcomes can be problematic.
c) Talent analytics function. Will talent analytics replace traditional HR functions, or should it be embedded within HR teams? Each option carries costs and benefits, and at the moment there is no clear-cut answer to this question (Rasmussen and Ulrich 2015). However, the implications for the organization may be rather significant: for instance, if talent analytics is not embedded in HR teams, the domain-specific knowledge may be lost, and talent analytics may become a senior management tool that is detached from the reality of human resources management (Rasmussen and Ulrich 2015).
[2] In this paper, we use the label "talent analytics", introduced by Talent Management Applications around 2006 (CedarCrestone 2006 Workforce Technologies and Service Delivery Approaches Survey, Ninth Annual Edition) and popularized by Davenport et al. (2010). Talent analytics is different from "people analytics," which was first introduced by Google to describe their data-driven approach to human resources management.
[3] The KPMG/EIU report (KPMG 2015) showed that executives from industries such as IT, biotechnology, and financial services widely used talent analytics.
Against this background, the purpose of this paper is twofold. First, we want to discuss the opportunities talent analytics offers HR practitioners. As there is not much theoretical analysis around the topic, we start from practice and explore a number of case studies on how talent analytics has improved the management of human resources and organizational decision-making. From the case studies, we identify the channels through which talent analytics can contribute to value creation. Second, while discussing the opportunities that talent analytics offers organizations, the paper discusses the critical areas presented above. These issues are particularly relevant to HR departments, which by their nature work with sensitive and personal data. Finally, the paper focuses on artificial intelligence and its deployment in the context of HR management and provides some examples of how it is currently used by some organizations.
The structure of the paper is as follows. Section 2 introduces the concept of big data and the associated analytical methodologies, while talent analytics is discussed in Section 3. Section 4 discusses what academic research can tell us about the relationship between talent analytics and performance. Section 5 illustrates some case studies of organizations that have used talent analytics to improve their performance. Sections 6 and 7 focus on the challenges associated with the use of talent analytics, while Section 8 introduces the concept of artificial intelligence and its use in the context of human resources management. Finally, Section 9 offers some concluding remarks.

Defining Big Data and Analytics
Big data is a label commonly used to identify large volumes of data produced by sensors, wearables, and social media platforms. Big data can be structured or unstructured, although it has been pointed out that unstructured data (i.e., data whose underlying data schema is unclear) are the most common type of big data (Dedic and Stanier 2016). Generally speaking, big data are defined with the help of the 3Vs (Akter et al. 2016): volume, velocity, and variety. Volume refers to the amount of data produced by various sources such as social media, business transactions, and the Internet of Things. Velocity refers to the speed at which data are produced, while variety refers to the range of data formats. Other dimensions of big data have been identified as important (McAfee and Brynjolfsson 2012; Kwon and Sim 2013), namely variability and complexity. Variability refers to the frequency of the data (i.e., they can be daily, hourly, or real-time), while complexity refers to the fact that the multiplicity of data sources makes the data difficult to work with because of the diverging data schemas underlying their collection. Most organizations use databases to store complex data. Databases tend to be cloud-based and are very efficient for running reports and visualizing data. The data stored include employee details, holidays, working hours, rotas, timesheets, etc. In the context of big data management, the concept of a "data lake" has become very popular. Data lakes allow companies to store several types of data at a low cost, as they do not require the data to be transformed to fit a prescribed data model. Data lakes are very useful for insight discovery rather than for analysis. In terms of data storage, data lakes complement data warehouses, which usually employ a pre-defined data model and therefore can support reporting activities and advanced analytics (which usually require a data schema).
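The contrast between a data lake and a data warehouse can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the record shapes, the field names, and the `TIMESHEET_SCHEMA` and `load_timesheets` helpers are hypothetical and do not describe any particular product. The "lake" keeps records exactly as they arrive, while the "warehouse" table only accepts records cast to a pre-defined model.

```python
import json

# Hypothetical raw HR events arriving in different shapes (all values are made up).
raw_events = [
    '{"employee_id": 1, "type": "timesheet", "hours": 37.5}',
    '{"employee_id": 2, "type": "holiday", "days": 3}',
    '{"employee_id": 1, "type": "survey", "comment": "Great team"}',
]

# "Data lake": every record is kept as received, with no prescribed data model.
data_lake = list(raw_events)

# "Data warehouse": a pre-defined model listing the columns and their types.
TIMESHEET_SCHEMA = {"employee_id": int, "hours": float}


def load_timesheets(lake):
    """Select timesheet records from the lake and cast them to the schema."""
    table = []
    for line in lake:
        record = json.loads(line)
        if record.get("type") != "timesheet":
            continue  # other record types do not belong in this table
        row = {}
        for column, column_type in TIMESHEET_SCHEMA.items():
            row[column] = column_type(record[column])
        table.append(row)
    return table


timesheets = load_timesheets(data_lake)
total_hours = sum(row["hours"] for row in timesheets)
```

Because the warehouse table has a fixed set of typed columns, a report such as total hours worked is a one-line aggregation, whereas the lake retains the survey comment and holiday records for later exploratory work.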
Analytics is a term that has become more popular as the concept of big data has gained traction in the business world. While the term tends to be used as a synonym of big data, in reality the two terms are distinct: analytics refers to the methodologies that allow one to analyze the big data stored by businesses. Some work has been carried out to classify the different types of analytics businesses can use, and INFORMS suggests there are three types of analytics that are relevant to organizations: descriptive, predictive, and prescriptive analytics. Descriptive analytics explores patterns in the data (Joseph and Johnson 2013) and uses statistical analysis to summarize and visualize data (Rehman et al. 2016). In contrast, predictive analytics is a set of techniques that can be used to predict future outcomes based on historical data. Machine learning and forecasting models are the key tools for predictive analytics (Joseph and Johnson 2013; Gandomi and Haider 2015; Rehman et al. 2016). Finally, prescriptive analytics relies on optimization, simulation, and heuristics-based techniques to model alternative scenarios and their impact on business outcomes (Evans and Lindner 2012).
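To make the three types of analytics concrete, the sketch below applies each of them to the same small series. The quarterly leaver counts, the two cost parameters, and the `scenario_cost` helper are all hypothetical values chosen for illustration; the least-squares line stands in for the much richer forecasting and machine learning models mentioned above.

```python
from statistics import mean

# Hypothetical quarterly leaver counts for one department (made-up numbers).
quarters = [1, 2, 3, 4, 5, 6]
leavers = [4, 5, 7, 8, 10, 11]

# Descriptive: summarize what has already happened.
average_leavers = mean(leavers)

# Predictive: fit a least-squares trend line and extrapolate one quarter ahead.
x_bar, y_bar = mean(quarters), mean(leavers)
slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(quarters, leavers)) / sum(
    (x - x_bar) ** 2 for x in quarters
)
intercept = y_bar - slope * x_bar
forecast_q7 = intercept + slope * 7

# Prescriptive: compare alternative hiring scenarios under assumed costs.
HIRE_COST = 5_000     # assumed cost of one hire
VACANCY_COST = 8_000  # assumed cost of one unfilled vacancy


def scenario_cost(hires):
    """Total cost of a scenario: hiring cost plus the cost of unfilled vacancies."""
    unfilled = max(0.0, forecast_q7 - hires)
    return hires * HIRE_COST + unfilled * VACANCY_COST


# Pick the number of hires (0 to 20) with the lowest total cost.
best_hires = min(range(0, 21), key=scenario_cost)
```

The descriptive step only looks backwards, the predictive step produces a forecast for the next quarter, and the prescriptive step uses that forecast to choose among scenarios; the recommendation is only as good as the assumed costs.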

Defining Talent Analytics
Organizations' interest in the data on their employees is not new. Although the label "talent analytics" itself is new, in reality organizations have always tried to make sense of the employee data they store to improve their overall performance. The intellectual antecedent of talent analytics can be traced back to the concept of "scientific management" (Kaufman 2014), although the big push for talent analytics comes from the growing interest in evidence-based management, i.e., decision-making based on the use of evidence from multiple sources (Barends et al. 2014). What is especially new (besides the label, of course) is the fact that organizations have the capability of combining a variety of data streams (both internal and external to the organization) and using them to address key questions around recruitment, retention, and human resources management (CIPD 2013; OrgVue 2019). In addition, talent analytics can be used to measure the return on investment in specific areas or the benefit of a specific compensation scheme, and can therefore provide senior management with key facts that can inform strategy development (Guenole et al. 2017). In this respect, a key benefit of talent analytics is the fact that it eliminates the "gut feelings" that may drive decisions at the senior level (Levenson 2015; Guenole et al. 2017).
However, there is still some confusion about what talent analytics means in practice for HR professionals, mostly because it is conflated with other traditional HR activities linked to metrics. Fink and Sturman (2017) suggest that metrics and key performance indicators (KPIs) are useful to evaluate the effectiveness of existing processes. In other words, they can assess how HR teams are performing now (Fink and Sturman 2017). This is different from what talent analytics aims to do: its ultimate goal is to identify patterns so as to predict alternative scenarios that can inform strategic decisions (Levenson 2015). An example can illustrate the difference. An organization may have policies to increase the ethnic diversity of its workforce. In this context, talent analytics can help identify the set of potential measures that may increase diversity and assess their future impact on, for instance, turnover. This is very different from what metrics and KPIs do, as they only capture the present moment (i.e., whether the number of employees from diverse ethnic backgrounds has increased) and cannot assess the impact on future performance. Davenport et al. (2010) have identified six sub-fields of talent analytics, including the following:
Human-Capital Management: In this context, talent analytics is used to measure the workforce's human capital and its productivity; in turn, this may help identify the optimal (in terms of skill mix and knowledge) team configuration.
Analytical Human Resources: Talent analytics can help identify the drivers of performance at the departmental level and their contribution to overall performance.
Workforce Forecasts: This type of talent analytics allows one to forecast the impact on organizational performance of alternative workforce scenarios.
The Talent Supply Chain: Analytics is used here mostly to make decisions about staffing (and the talent supply chain) as well as the related processes.
However, there are other areas where talent analytics has been deployed effectively:
Recruitment. Analytics has been used by HR teams to identify potential pools of candidates in places that have previously been dismissed. This is particularly relevant when vacancies require very rare skillsets or when organizations are trying to hire from the same pool of applicants as their competitors. In both cases, organizations may use analytics to identify applicants from alternative backgrounds who may still have the required skillsets.
Talent analytics can also provide additional information on applicants that can help remove unconscious bias from the recruitment process (CRF Research 2017). This issue is particularly important for organizations that try to recruit applicants from backgrounds they are not used to hiring from. In this case, the use of a broader set of indicators of performance and ability can help organizations to recruit the candidate who is the best fit.
Training and Development. An important role of HR teams is to provide training to the existing workforce (Fink and Sturman 2017). Analytics-based dashboards have been used to develop personalized training plans for new employees based on their education and the characteristics of their job. In turn, dashboards can produce data that can be used to calculate the return on the training investment. This is a difficult area for organizations, as the return on the training investment is usually calculated at a firm (or department) level but never at an individual level. However, the possibility of collecting very granular data on individual performance eliminates this problem and allows the senior management team to identify the benefits of the investment in training.
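The difference between a firm-level and an individual-level return on training can be shown with a short sketch. It uses the standard return-on-investment ratio, (gain − cost) / cost; the employee records and the `training_roi` helper are hypothetical, and the sketch simply assumes that the output gain attributable to training has already been measured, which in practice is the hard part.

```python
# Hypothetical per-employee records: training cost and the change in output value
# attributed to training (all figures are made up for illustration).
records = [
    {"employee": "A", "training_cost": 1_000.0, "output_gain": 1_500.0},
    {"employee": "B", "training_cost": 2_000.0, "output_gain": 1_800.0},
    {"employee": "C", "training_cost": 500.0, "output_gain": 1_250.0},
]


def training_roi(cost, gain):
    """Return on investment as a fraction of the amount invested."""
    return (gain - cost) / cost


# Individual-level view: one ROI figure per employee.
individual_roi = {
    r["employee"]: training_roi(r["training_cost"], r["output_gain"]) for r in records
}

# Firm-level view: a single ROI figure computed on the totals.
total_cost = sum(r["training_cost"] for r in records)
total_gain = sum(r["output_gain"] for r in records)
firm_roi = training_roi(total_cost, total_gain)

# Employees whose training did not pay for itself in this period.
negative_return = [name for name, roi in individual_roi.items() if roi < 0]
```

In this made-up example the firm-level figure is positive, yet one employee shows a negative return; this is the kind of detail that aggregate calculations hide and granular data can reveal.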
Performance and Compensation. Generally speaking, organizations tend to collect a substantial amount of information on each employee, which can be used to build performance management systems that reward employees based on their individual performance. Talent analytics offers the tools to link individual performance management to compensation, with the result that organizations can map and connect organizational performance to single employees or teams.
Employee Engagement and Motivation. Organizational performance is usually assumed to be linked to employees' engagement. Accordingly, organizations may collect data that can be used to measure workforce engagement and to design the support that workers may need. At the organizational level, these data can help senior management to understand the drivers of staff turnover and design policies that can improve staff retention.

Using Talent Analytics to Drive Organizational Performance
What is the relationship between talent analytics and organizational performance? Academic research on analytics and performance suggests that analytics (and the associated big data) is an example of an IT resource that can drive firm performance, although the channels underpinning such a relationship are not well specified (Groves et al. 2013; Braganza et al. 2017; Wamba et al. 2017). Theoretically, management researchers have tended to rely on the resource-based view (RBV) to identify the link between big data and performance (Barney 1991; Verbeke and Yuan 2013). According to the RBV, the sources of competitive advantage need to be identified first; once this is done, it is possible to identify how the areas typically associated with analytics support competitive advantage. Alternatively, analytics can be considered an example of investment in intangible assets that create value, although their contribution to profitability can be difficult to quantify because of their characteristics (Haskel and Westlake 2017; Brynjolfsson and Hill 2000).
Most of the academic discussion on talent analytics and performance (Guenole et al. 2017; Levenson 2015) has used theoretical frameworks derived from strategic management theories (Douthitt and Mondore 2014; Mondore et al. 2011; Rasmussen and Ulrich 2015). The RBV in particular has been widely used for this purpose: in this context, talent analytics is associated with improvements in performance as it is a unique resource that organizations can use to generate competitive advantage. This view has been criticized for two main reasons. First, it does not explain how talent analytics generates value: for instance, it is possible to identify plenty of analytics projects that have destroyed firm value, suggesting that the use of analytics is not a sufficient condition for value creation and that in reality additional conditions have to be present for talent analytics to generate value. Second, the RBV does not explain how changing economic conditions may affect the relationship between performance and talent analytics.
As a result, researchers have tried to use alternative theoretical perspectives to analyze the relationship between talent analytics and performance. Aral et al. (2012), for instance, have used agency theory and suggest that talent analytics is associated with improved performance as it allows one to monitor staff behavior and to align the incentives of managers and employees. The study identified the resources that have to be combined with talent analytics, namely ICT and performance-related pay, suggesting that organizations combining them provide the motivation and opportunity to support employees. In these cases, productivity may increase as well. Some authors have pointed out that some talent analytics projects produce mostly cost savings, with the result that their impact on business performance is limited [4]. Case studies can provide some additional evidence on the relationship between talent analytics and performance. Marler and Boudreau (2017) reported on a few papers that have analyzed how talent analytics can improve engagement, which in turn resulted in an increase in sales. Van Iddekinge et al. (2016) analyze the use of social media data to support the selection of job candidates. Their study suggests that there is no correlation between social media profiles and either job performance or turnover. Finally, Lam et al. (2016) suggest that the benefit from talent analytics is mostly driven by the organizational changes it generates.

From Theory to Practice: What Do Case Studies Tell Us?
As mentioned by several authors, talent analytics has become common among firms over the last five years (OrgVue 2019). As a result, the use of talent analytics within HR teams is better understood than it was before (Kamp 2017; Bersin 2012), and a number of case studies are now available to map and understand how talent analytics has been used by HR departments (within large firms at least). Much of the early focus of talent analytics was on specific HR processes such as recruitment and reporting, mostly to cut inefficiencies; however, this has changed over time as HR departments have started to use talent analytics in other areas that directly support business performance (OrgVue 2019). Other common applications of workforce analytics include workforce planning and reviewing the effectiveness of reward practices. Workforce analytics seems to have the greatest impact in large organizations that can afford to invest in the technology and expertise required to support analytics (Falletta 2014). For example, a 2016 study by the Society for Human Resource Management (SHRM) found that 79% of organizations with 10,000 employees or more now have data analysis roles within HR. In addition, the best examples typically come from sectors with a strong technical, scientific, or data orientation, such as high-tech sectors, biotechnology, and retail (Falletta 2014). In this section, we highlight some of the current applications of talent analytics and provide some real-life examples of how different types of analytics are used by companies [5].
Predictive Analytics. Predictive analytics is the most common type of analytics used by HR departments. It has been deployed in several contexts such as modelling staff turnover and employees' engagement.
Staff Turnover. Understanding the factors that make staff likely to leave a company is important, as it allows one to plan for recruitment as well as to identify a number of actions that can be taken to retain key staff. Several large companies have used predictive analytics to model staff turnover (so-called "flight risk") using a variety of data such as recruitment data, tenure, performance, location, and so on. These models can be estimated at the individual and/or group level and tend to return a risk score as well as a number of factors (both individual and organizational) that are significantly associated with the likelihood of leaving the organization. IBM [6], Nielsen [7], Unilever, and Experian have done so, and managers have used these models to build retention plans for key staff. Managers at Experian have even used the predictive model to test hypotheses on the optimal team size and to drive the structure of the organization; more importantly, they have decided to customize the model to the different areas where they operate. Another interesting application is the one from Cisco, which has developed models of talent flows that look at where the company's people were hired from and where they went when they left.
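A flight-risk model of the kind described above can be sketched as a logistic regression that returns a risk score per employee together with coefficients indicating which factors are associated with leaving. The sketch below is illustrative only: the data are invented, the two predictors (tenure and performance rating) are an assumption, and the `fit_flight_risk` helper is hypothetical; it does not represent the models used by any of the companies named in this section.

```python
import math
from statistics import mean, pstdev

# Hypothetical history: (tenure in years, performance rating 1-5, left the firm 0/1).
history = [
    (0.5, 2, 1), (1.0, 3, 1), (1.5, 2, 1), (0.8, 4, 1),
    (2.0, 4, 0), (3.0, 3, 0), (4.0, 4, 0), (5.0, 5, 0),
    (6.0, 3, 0), (1.2, 5, 0),
]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_flight_risk(rows, learning_rate=0.1, epochs=2000):
    """Fit a logistic regression by batch gradient descent on standardized features."""
    tenures = [r[0] for r in rows]
    ratings = [r[1] for r in rows]
    tenure_mu, tenure_sd = mean(tenures), pstdev(tenures)
    rating_mu, rating_sd = mean(ratings), pstdev(ratings)

    def features(tenure, rating):
        # Intercept term plus the two standardized predictors.
        return (1.0, (tenure - tenure_mu) / tenure_sd, (rating - rating_mu) / rating_sd)

    weights = [0.0, 0.0, 0.0]  # intercept, tenure, rating
    for _ in range(epochs):
        gradient = [0.0, 0.0, 0.0]
        for tenure, rating, left in rows:
            x = features(tenure, rating)
            error = sigmoid(sum(w * xi for w, xi in zip(weights, x))) - left
            for j, xi in enumerate(x):
                gradient[j] += error * xi / len(rows)
        weights = [w - learning_rate * g for w, g in zip(weights, gradient)]

    def risk(tenure, rating):
        """Estimated probability of leaving for an employee with these attributes."""
        x = features(tenure, rating)
        return sigmoid(sum(w * xi for w, xi in zip(weights, x)))

    return weights, risk


weights, risk = fit_flight_risk(history)
new_joiner_risk = risk(0.5, 2)  # short tenure, low rating
veteran_risk = risk(5.0, 4)     # long tenure, high rating
```

The sign of each fitted weight indicates the direction of the association (here, a negative tenure weight means longer-serving staff are scored as less likely to leave). A real application would use far more data, more predictors, and a proper out-of-sample evaluation before any retention plan is built on the scores.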
Engagement. Another common area of focus for talent analytics is employee engagement. Typically, organizations assume that more engaged staff tend to be more productive, and as a result a number of businesses have tried to model different aspects of employees' engagement and identify its drivers. For instance, E.ON has developed a model for absenteeism, while Shell has modelled the relationship between engagement, leadership, and safety (a key driver of business performance). Shell's research has shown that employee engagement is the single biggest driver of individual performance, and it has established a causal link between engagement and sales in various parts of the business.
Recruitment. The challenge in this area is to identify the best suite of tests that allow one to identify the candidates who can contribute most to performance once they are hired. This is an area that has mostly benefitted from predictive analytics. Rentokil Initial has collected data on performance drivers and used them to devise automated assessment techniques to support recruitment globally. Another interesting application of predictive analytics to recruitment is from Opower, which was able to determine the number of interviewers in a panel required to ensure a selected candidate was more likely to perform better on the job.
Network analysis. This is a suite of tools that allows one to map connections among members of staff and/or teams within organizations. These tools allow one to identify hidden networks that may be major contributors to performance and to take actions that may, for instance, reinforce connections. Some organizations use a mix of qualitative and social media data to identify these networks. For instance, AB Sugar has been using network analysis for four years to support collaboration among key specialist groups such as chemical engineers and agriculture managers. The company has created communities of practice that share expertise and has used analytics both to set up communities and to measure the value they create.
Sentiment analysis. Some companies are using analytics to conduct sentiment analysis to test the "mood" of the workforce and detect early signals of potential issues. Typically, this means correlating public data such as postings on internal social media with other data. JPMorgan Chase analyzes unstructured data from internal social media platforms and employee surveys to come up with a view at any point in time of the sentiment among the workforce. Unilever conducts sentiment analysis by connecting data from internal surveys with information posted by employees and candidates on Glassdoor (the website where current and former employees anonymously review companies and their management). Hitachi Data Systems is using machine learning to understand employees' reactions to relocation. The analysis included modelling different hypotheses around what retention targets to set, what relocation incentives to offer, implications for the company's diversity profile, and what financial provision to make for restructuring.
[5] Our case studies have been sourced from the HBR paper and from CRF Research (2017).
[6] The company also added data from its sentiment analysis tool called Social Pulse, which monitors posts and comments made by employees on Connections (IBM's internal social media platform). The hypothesis is that engagement with social media might fall when employees are actively thinking of leaving the firm.
[7] This project identified that the first year mattered the most and, as a result, employees were invited to meet their management a number of times during their first year to check how they were doing.
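The network analysis described in this section can be sketched in a few lines. The example below counts each person's direct connections and flags the people whose links cross team boundaries, which is one simple way of surfacing "hidden" connectors. The names, teams, and collaboration links are entirely made up and are not data from any of the organizations mentioned above.

```python
from collections import defaultdict

# Hypothetical collaboration links (who works with whom); all names are invented.
collaborations = [
    ("Ana", "Ben"), ("Ana", "Cho"), ("Ben", "Cho"),  # within engineering
    ("Dev", "Eli"), ("Dev", "Fay"),                  # within agriculture
    ("Cho", "Dev"),                                  # link across the two teams
]
team = {
    "Ana": "engineering", "Ben": "engineering", "Cho": "engineering",
    "Dev": "agriculture", "Eli": "agriculture", "Fay": "agriculture",
}

# Build an undirected adjacency map from the list of links.
neighbours = defaultdict(set)
for a, b in collaborations:
    neighbours[a].add(b)
    neighbours[b].add(a)

# Degree: the number of direct collaborators each person has.
degree = {person: len(links) for person, links in neighbours.items()}

# Connectors: people with at least one collaborator in a different team.
connectors = sorted(
    person
    for person, links in neighbours.items()
    if any(team[other] != team[person] for other in links)
)
```

In this toy network only two people connect the specialist groups, so supporting those links (or adding new ones) would be the natural action to take; richer measures such as betweenness centrality follow the same logic on larger graphs.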

What Can We Learn from These Case Studies?
From these case studies, we can identify a number of features that are common to all the talent analytics projects discussed above. First, unlike other types of business analytics, talent analytics projects require a good understanding of the links between business performance and individual/team performance (Levenson 2015). Importantly, the underlying model needs to consider the organization as a system in which what drives performance is made explicit, together with the channels that can have a positive impact on performance (Kamp 2017). In addition, the model has to make clear how specific HR-related issues (such as retention or workforce planning) drive business outcomes such as profitability and/or productivity growth (Levenson and Fink 2017). In turn, this model allows one to identify the hypotheses that can be used to drive the talent analytics project. This is different from the approach that is traditionally associated with analytics (i.e., looking for patterns in data that allow one to construct hypotheses on behavior ex post), but in reality it is key to the success of talent analytics: its ultimate purpose is to drive business performance, and it is difficult to fulfill such a goal without a clear understanding of what drives performance. In practice, Levenson (2015) suggests starting by identifying the main business objective(s) to achieve and then identifying the levers through which talent analytics can improve performance. From the experience of the organizations listed above, causal links and hypotheses are identified first and then checked again on the basis of the outcomes (Levenson 2015). This process is summarized in Figure 1. Second, mapping talent analytics' outcomes to organizational outcomes can be tricky. Table 1 lists the HR outcomes of specific talent analytics projects (identified from some of the case studies detailed above) and the potential performance outcomes.
While most organizations can agree on each list, identifying the drivers of the organizational outcomes from the list of talent analytics outcomes is complicated: for instance, we can argue that the compensation structure may simultaneously affect productivity, sales performance, and profitability. However, the direction of the impact may be different: for instance, an increase in compensation may reduce profitability in the short term, but the increase in sales performance may be large enough to offset the increasing labor costs in the long run. Until a model of what drives performance is specified and estimated, these potential relationships and feedback effects will not be clear, with the result that the actual impact of talent analytics on organizational performance may be under- or overestimated. Indeed, extracting insights from data is only the first step, and in reality, they are valuable as long as they are translated into actions that can improve business performance (Rasmussen and Ulrich 2015).
Third, most organizations treat talent analytics not so much as a resource but more like a capability with the potential to contribute to value creation. Interestingly, this is in line with the existing literature on analytics and performance, which uses the dynamic capability approach as the theoretical lens through which the contribution of analytics to organizational outcomes can be analyzed (Teece et al. 1997). If we use this theoretical perspective, it is important to recall that talent analytics creates value as long as it is embedded in a firm's key capabilities (Chen et al. 2012). Examples of capabilities that matter for this purpose include learning capability (as organizational learning has to support the implementation of talent analytics projects across the organization), coordinating capability (as different sections need to coordinate their activities so that talent analytics can create value), infrastructural capability (i.e., data storage capability), and technical capability (i.e., the capability of processing HR data) (Teece 2018; Mikalef et al. 2016).

Challenges
A few authors have identified a number of challenges that organizations may face when implementing talent analytics. These include data availability, the analytical skills among HR professionals (Angrave et al. 2016; Levenson 2011; Mondore et al. 2011; Rasmussen and Ulrich 2015), support from senior management (Rasmussen and Ulrich 2015), and access to core IT capabilities (Angrave et al. 2016; Aral et al. 2012; Douthitt and Mondore 2014).
Data availability. Organizations that plan to use talent analytics first have to identify the data they need to support a talent analytics function (Bersin 2012). Typically, talent analytics requires both employee and organizational (programmes and performance) data. Bersin (2012) has identified some of the potential data that are needed for this purpose:
• attendance;
• assessments;
• performance;
• competencies;
• engagement;
• geography;
• job status;
• job type;
• training;
• application source;
• diversity.
Some authors have questioned whether these are big data (Cappelli 2017). However, this is not very important, as what matters in this context is access to data and/or their quality. HR teams may not have access to all the data they need, and an important component of a talent analytics project is to develop a strategy for data acquisition and identify the data that can be used for the project (Cappelli 2017). These data may be more or less granular (according to the frequency with which they are collected), and most of the time they are structured, as the data schema is linked to the employee who is the unit of observation (OrgVue 2019). Importantly, qualitative data are still relevant in this domain 8 , and analytics techniques that can handle both qualitative and quantitative data are very well established. Unstructured data that may give information on employees' performance are usually collected by different teams (for instance, the sales teams in the case of the salesforce) and may not necessarily be stored by HR teams (CIPD 2013; OrgVue 2019).
The quality of data and the ease with which new data can be acquired can be a major barrier to the use of talent analytics for some organizations (CIPD 2013). Indeed, some of the HR data may be locked in legacy systems, which makes them not very useful if they need to be merged with data from different systems (Douthitt and Mondore 2014). Linked to this issue is the problem of data migration (CIPD 2013). Most organizations may inherit old systems that need to be integrated with new systems, while the data need to be migrated. However, data migration and the update of systems may be costly, and small organizations may prefer to avoid the upgrade if its cost is likely to exceed the expected (long-term) benefit of the investment. As a result, relevant data for talent analytics can be stored in different databases that may or may not be compatible.
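The compatibility problem described above can be made concrete with a minimal sketch. It assumes two hypothetical extracts, a legacy HR system and a newer training system, that identify the same employees with different key and date conventions; all field names, identifiers, and values are invented for illustration and do not come from any of the organizations discussed.

```python
from datetime import date, datetime

# Hypothetical extracts (illustrative data only): the legacy HR system uses
# string keys such as "EMP-0012" and day/month/year dates, whereas the
# training system uses integer keys and ISO dates.
legacy_hr = [
    {"emp_no": "EMP-0012", "hired": "03/09/2015", "job_type": "sales"},
    {"emp_no": "EMP-0047", "hired": "21/01/2018", "job_type": "engineering"},
]
training_system = [
    {"employee_id": 12, "course": "negotiation", "completed": "2019-05-14"},
    {"employee_id": 47, "course": "python", "completed": "2019-11-02"},
    {"employee_id": 99, "course": "safety", "completed": "2019-12-01"},
]


def normalize_legacy(row):
    """Convert a legacy row to the key and date conventions of the newer system."""
    return {
        "employee_id": int(row["emp_no"].split("-")[1]),
        "hired": datetime.strptime(row["hired"], "%d/%m/%Y").date(),
        "job_type": row["job_type"],
    }


def merge_sources(legacy_rows, training_rows):
    """Join the two sources on employee_id and report training rows with no HR match."""
    hr_by_id = {r["employee_id"]: r for r in map(normalize_legacy, legacy_rows)}
    merged, unmatched = [], []
    for t in training_rows:
        hr = hr_by_id.get(t["employee_id"])
        if hr is None:
            unmatched.append(t["employee_id"])
            continue
        merged.append(
            {**hr, "course": t["course"], "completed": date.fromisoformat(t["completed"])}
        )
    return merged, unmatched


merged, unmatched = merge_sources(legacy_hr, training_system)
```

In this sketch, the unmatched identifiers are returned rather than silently dropped, since records that cannot be linked across systems are themselves an indication of the data quality issues discussed above.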
In addition, the nature of data stored by HR teams imposes a number of constraints on their portability across different parts of the organization. For instance, multinationals are aware that different countries have different legal regimes around the possibility of sharing data on human resources. In other words, "data silos" limit the capability of using talent analytics (CIPD 2013). According to a report from CIPD (2013), there are several types of data silos that are relevant in this field. The first of these silos is structural and is due to the way organizations are structured. For instance, HR teams and operations tend to be two different teams that may share only limited amounts of data. In small organizations, this may be less of a problem, as their size makes it possible to find a way around these constraints (CIPD 2013). However, this type of data silo may be important in the case of large organizations. Needless to say, the problem is exacerbated in the case of teams that operate across different locations. A second type of data silo is created by the different reporting requirements that teams have for different projects, with the result that data produced by different areas of the organization are not compatible (CIPD 2013). Clearly, reporting lines and management structure limit the flow of data across an organization, with the result that talent analytics cannot be exploited effectively. The third type of silo that is important to consider in this context is the one created by the need to separate databases for security reasons (CIPD 2013). Indeed, organizations may decide not to merge different types of data if some of these are sensitive and need additional protection. Therefore, they are kept on separate servers and are subject to different levels of security.
Another important aspect in this context is data governance, which goes well beyond data consent and is really about how the data will be managed and for what purpose (Cappelli 2017). Data governance should provide a clear framework that guides the use of the data and of the insights well beyond the HR team.
Finally, systems matter (Aral et al. 2012). IT is an enabler but at the same time, it needs to be accessible and easy to understand. In turn, this implies that the capabilities for the use of talent analytics tend to be built in an organic way, implying that organizations develop analytics capabilities as they try to solve business problems (Canlas 2015). Some consultancies have developed maturity models that describe the processes that organizations follow when they implement analytics. The most commonly cited model is the one from Bersin (2012), shown in Figure 2.
8 Data on personality traits are usually considered to be important in talent analytics projects. In addition, behavioral and interaction data (sourced from sensors or wearables) tend to be textual and qualitative data, which need to be analyzed with specific tools.

Skills.
Skills is an area that may be challenging for the development of talent analytics. A shortage of IT skills is the most common reason for not using talent analytics. However, it is not just computing skills that matter; the capability of using the outcomes of the analysis by connecting them to business outcomes is also needed (Bassi 2011). This last one is a critical capability upon which the success of talent analytics hinges. At the moment, we do not have a list of core competencies required by talent analytics. However, Levenson (2011) provided an initial list of analytical competencies needed to perform talent analytics. It included basic multivariate models, advanced multivariate models, data preparation, research design, and quantitative data collection and analysis.
Needless to say, the requirements to perform talent analytics have changed over time: machine learning has become another key skill, as has a basic understanding of natural language processing. Unsurprisingly, the supply of these skills is limited. This is not something that is specific to the HR profession, but it is linked to the general shortage of these skills across the population. Finally, Rasmussen and Ulrich (2015) point out that HR professionals do not tend to be trained in general business management, with the result that they cannot link insights from talent analytics to business outcomes. Overall, a survey on data analysis skills by the Society for Human Resource Management (2016) showed that 59% of organizations expected an increase in positions needed in a five-year timeframe towards 2021. The majority of organizations would recruit for mid-level management or non-management individual contributors rather than at entry level; three out of five organizations have a need at senior and executive levels (ibid.). The focus of competence is on moderate skills (83%) that are likely to require a Bachelor's degree (ibid.). The survey concludes that demand will increase over a ten-year period. Demand will thus outstrip supply; a skills shortage is to be expected, and more planning will be required internally by organizations. In the realm of HR, many staff members work as HR generalists and do not need a specific prior background or degree to enter their HR career but build the latter through on-the-job accreditation courses (e.g., via the CIPD in the UK). In terms of the opportunity for learning the necessary data analysis skills, professional courses are being offered on different pathways matching the HR job levels (CIPD, UK). However, universities will play an increasingly important role by offering programs that develop the most often desired set of skills listed above (e.g., pathways in business analytics).
Regarding HR professionals, as already shown in Section 4, the training priority will be satisfying the demand of large organizations. Given that 98% of organizations surveyed by the SHRM would recruit full-time staff rather than staff on part-time or other contracts, the outlook for future employment in the area is encouraging.
Another issue is the organizational fragmentation in terms of IT provision. This is simply due to the fact that most IT applications are unique to each department. All this reinforces the organizational silos that prevent departments from sharing data. Separate departments use separate IT systems, all of which require recording data using standards which may not be compatible with those employed by another division. This encourages those teams to work in different ways and develop skillsets that may be useful for a specific computing environment but not necessarily suitable to different ones.
Building a talent analytics taskforce. A bigger (related) question that organizations interested in talent analytics need to answer is whether the analytics function resides in the HR team or in the senior management team (Holley 2013). This is a very difficult question to answer and in reality, there is no right answer. There is an argument that talent analytics may inform choices at the senior management level but still the talent analytics function may need to be managed by the HR team (Rasmussen and Ulrich 2015). Some organizations prefer to have a separate team for analytics that includes talent analytics: the team tends to produce dashboards and analytics outputs and act as a service provider to other units; this way they avoid data silos that prevent data sharing (Holley 2013). Alternatively, the team may sit within HR and design specific evidence-driven solutions that can improve business performance. In this case, the team has to be able to identify what to prioritize in terms of business areas but must also have a key role at the strategic level.
Organizations need to consider a number of questions when deciding on the structure and reporting lines of the workforce analytics team. There are advantages and disadvantages to having a specialist team within HR (Rasmussen and Ulrich 2015). The benefit of this choice is that there is specific domain knowledge that needs to be applied so that the results can be interpreted correctly. However, the risk is that its activities do not inform strategic decision-making (Green 2017).
In addition, it is important to clarify the purpose of the team and its position in the organigram of the organization (CRF Research 2017; Holley 2013). In this context, relationships with other functions are also important (Rasmussen and Ulrich 2015). Building strong connections with other analytics teams, particularly in finance, marketing, customer service, and operations, is important (Green 2017). Talent analytics projects require linking multiple data sources from inside and outside HR systems (Green 2017). Needless to say, this can be done more easily if the talent analytics team is part of a wider analytics function in the organization, as HR teams may not be able to access data on business performance (OrgVue 2019). At the same time, a specialized team is required, as it may have the capability of understanding what drives motivation and behavior, unlike business analytics groups.
Another important area for the development of a talent analytics team is identifying key stakeholders who are pivotal in ensuring that recommendations from talent analytics projects are implemented (Green 2017). Examples of stakeholders include the HR director, HR business partners, and the board team (Holley 2013). A few papers have highlighted the importance of having support from the senior management and other key stakeholders. The reason for this is that talent analytics threatens the role of intuition in decision-making and may threaten the tacit knowledge that is linked to management (Falletta 2014). Indeed, Rasmussen and Ulrich (2015) have noticed that managers tend to reject evidence that is against their beliefs and that, on these occasions, they will prefer their beliefs. In other words, there is a need to develop a framework for the correct use of talent analytics and its insights, and this framework needs a contribution from other business functions. Later in the paper, we introduce our own proposal for an ethical framework that can be inclusive of different actors and stakeholders. However, before we can proceed to that point, we need to consider the crucial role of artificial intelligence (AI) for new developments in talent analytics.

AI and Talent Analytics
In the previous sections, we have focused on analytics and how it can support the management of human resources. In some sense, we have charted past trends in human resources management, as in reality most organizations are interested in what AI can offer their HR teams (Das; Melder 2018). AI aims at having machines mimic what human intelligence can do (Acemoglu and Restrepo 2017). At a basic level, AI tries to use computers to perform (usually repetitive) tasks faster than humans; in this respect, automation is an important component of AI (Acemoglu and Restrepo 2017). Of course, there are other aspects of AI that matter: for instance, deep learning and machine learning are relevant and underpin most of the AI technologies we use in everyday life (Das). Machine learning is a label for a number of algorithms that allow one to either uncover data patterns (through unsupervised machine learning) or predict outcomes (using supervised machine learning). In the latter case, algorithms learn from a training dataset in which the outcome is known for each observation, and the resulting models are used to predict the outcome for new observations; typically, regression and classification analysis are two types of supervised machine learning models. In the case of unsupervised machine learning, the data contain no known outcome, and the algorithm looks for structure in the inputs without being told what the output should be. An example of such types of models is clustering analysis.
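The distinction between the two families of algorithms can be illustrated with a minimal sketch. The data are hypothetical (two invented features per employee and an invented "left the firm" label), and the two methods, a nearest-neighbour classifier and k-means clustering, are chosen only because they are short to write down; they are not the methods used in the case studies.

```python
from statistics import mean

# Hypothetical employee features: (engagement score, weekly overtime hours).
features = [(8.1, 2.0), (7.5, 3.0), (7.9, 1.5), (3.2, 9.0), (2.8, 10.5), (3.9, 8.0)]
# Known outcomes, available only in the supervised setting:
# did the employee leave the firm within a year?
left_firm = [False, False, False, True, True, True]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict_nearest_neighbour(train_x, train_y, new_x):
    """Supervised: return the known outcome of the closest training example."""
    idx = min(range(len(train_x)), key=lambda i: sq_dist(train_x[i], new_x))
    return train_y[idx]


def k_means(points, centroids, iterations=10):
    """Unsupervised: group points around centroids without using any outcome."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [
            tuple(mean(dim) for dim in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    assignment = [
        min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c])) for p in points
    ]
    return assignment, centroids


# Supervised use: predict the outcome for a new, unlabelled employee.
prediction = predict_nearest_neighbour(features, left_firm, (3.5, 9.5))

# Unsupervised use: the labels are never passed to the algorithm.
assignment, centroids = k_means(features, centroids=[features[0], features[-1]])
```

The supervised function needs `left_firm` to make a prediction, whereas the clustering function only receives `features`; any interpretation of the resulting groups (for example, as "at risk" employees) has to be supplied by the analyst afterwards.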
AI has started to be used by HR teams and mostly within recruitment teams to automate repetitive tasks (Melder 2018;Miller-Merrell 2016). In the context of talent analytics, unsupervised machine learning can be useful to support recruitment. In this context, machine learning can be used to analyze social media posts and identify characteristics of the applicants, which may not be declared on the CV. Unsupervised machine learning can be used to recommend jobs based on previous searches and posts (LinkedIn is an organization that has used this approach to job recommendation). However, if the purpose of the talent analytics project is to quantify the contribution of the training investment to business volume, then a supervised regression model may be an option, as there are training data that can help one to choose the outputs (business volume) and the inputs (training and other controls) of the process.
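The regression option mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions: the variable names (training hours, team headcount as a single control, business volume) are hypothetical, the data are synthetic and constructed to follow an exact linear relationship so that the result is easy to check, and ordinary least squares is solved through the normal equations rather than with a statistical package.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    x = [[1.0] + list(r) for r in rows]  # prepend a constant for the intercept
    k = len(x[0])
    xtx = [[sum(xi[p] * xi[q] for xi in x) for q in range(k)] for p in range(k)]
    xty = [sum(xi[p] * yi for xi, yi in zip(x, y)) for p in range(k)]
    return solve_linear_system(xtx, xty)


# Synthetic data for six hypothetical teams: inputs are (training hours,
# headcount); the output is business volume. The values were constructed as
# volume = 10 + 2 * training + 0.5 * headcount, with no noise.
inputs = [(5, 12), (10, 9), (15, 15), (20, 11), (25, 14), (30, 10)]
business_volume = [26.0, 34.5, 47.5, 55.5, 67.0, 75.0]

intercept, training_effect, headcount_effect = ols(inputs, business_volume)
```

With real data the fit would not be exact, and the estimated coefficient on training would only be interpretable as its contribution to business volume if the relevant controls are included, which is the modelling step the preceding sections argue has to come first.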
IBM has built a number of "cognitive" talent applications based on the Watson machine learning platform (IBM 2016). Blue Matching uses artificial intelligence to match the skills of individuals with internal job opportunities and development programs. The algorithm can also spot opportunities that individuals may have overlooked or felt they were not qualified to do. Tools such as Blue Matching are used to improve workforce planning, and IBM can also use the system to "push" job opportunities or training programs, thereby encouraging people to develop skills it needs for future growth. [...] support IBM's "entire talent lifecycle" (p. 8), including attraction and recruitment, for employee engagement, retention, development, and growth and for services to support ongoing interaction with employees. Indeed, AI is used to source candidates, screen resumes, and match candidates to existing slots (Miller-Merrell 2016). More sophisticated AI tools allow organizations to scrutinize facial expressions in video interviews to understand potential engagement (Das). Unsupervised neural networks can analyze interview recordings and themes in employee surveys. Finally, a virtual assistant can help with employee onboarding. AI offers a number of opportunities to develop customized training plans based on specific interests and attention spans. Similarly, AI can facilitate internal job search and career progression (Das). Equally, AI may facilitate redeployment and outplacement in such a way that they are not damaging to employees.
Yet, for all the actual and potential benefits shown, the discussion of ethical questions about the use of AI for HR is in its infancy compared to the fast developments generated by attractive applications. Companies are aware of some implications and work towards creating user and designer awareness. For example, IBM stresses that their managers should be put into the position to override AI's instructions if desirable or necessary and that the reduction of "bias," the enhancement of diversity, and the fairness built into AI systems should also be considered in the design. The overall tenet is that AI capabilities can and should be fair and transparent (ibid.: 30), although IBM remains vague about how this could be achieved. We contend that it is important to look beyond company-based approaches such as IBM's. For example, the European Commission's AI High-Level Expert Group is currently piloting ethics guidelines for "trustworthy AI" (European Commission High Level Expert Group on Artificial Intelligence 2019). The framework stresses three components: AI must be "legal, ethical, and robust" (ibid.: 5). Unlike company-based approaches such as IBM's, it underlines the institutional and legal contexts, the need to prevent harm (especially for vulnerable people), and the wider societal risks such as the impact of AI on democracy and procedural justice. The framework we suggest next does not focus on AI's ethics, but it is compatible with the pursuit of more nuanced ways to approach ethics in the use of big data for talent analytics.

A Framework for the Ethical Use of Big Data in HR
The exploitation of big data offers many opportunities, as it allows data to be matched and linked to identify undiscovered patterns. Although the use of big data has the potential to change the way human resources are managed, there remain grey areas that concern the use of big data in the context of the HR function. Workforce data are highly sensitive, and organizations need to exercise great care in deciding what data to collect and what to do with them (Cappelli 2017). For example, it is possible to monitor the content of people's emails, but most organizations would shy away from what they would consider a highly intrusive practice. Increasing amounts of data are available from sensitive sources such as wearable technology and mobile phone records, placing an even greater responsibility on employers (CIPD 2013). Importantly, the GDPR limits the capability of the employer to use personal data for purposes not identified at the moment of data collection. The GDPR requires the employee to be kept informed of whether the data are collected and how they will be used. In addition, personal data can be processed only if the use is compliant with the original purpose that motivated data collection. Indeed, processing data for another purpose requires the employee's explicit consent. Finally, when personal data are no longer needed, they should be deleted from the server, implying that HR databases should be reviewed and cleansed periodically.
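The two rules summarized above (processing only for the purposes stated at collection unless further consent is given, and periodic deletion of data that are no longer needed) can be expressed as a minimal sketch. The record structure, field names, purposes, and retention periods are all hypothetical; the sketch simplifies the regulation considerably and only shows how such checks might be encoded in an HR database review.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta


@dataclass
class PersonalRecord:
    """Hypothetical HR record with the metadata needed for the two checks."""
    employee_id: int
    collected_on: date
    purposes: set            # purposes stated when the data were collected
    retention_days: int      # how long the data are assumed to be needed
    extra_consents: set = field(default_factory=set)  # later explicit consents


def may_process(record, purpose):
    """Allow processing only for an original purpose or one explicitly consented to."""
    return purpose in record.purposes or purpose in record.extra_consents


def is_expired(record, today):
    """A record is expired once its assumed retention period has passed."""
    return today > record.collected_on + timedelta(days=record.retention_days)


def cleanse(records, today):
    """Periodic review: keep only the records that are still within retention."""
    return [r for r in records if not is_expired(r, today)]


records = [
    PersonalRecord(1, date(2019, 6, 1), {"payroll"}, retention_days=365),
    PersonalRecord(2, date(2017, 1, 1), {"payroll", "training"}, retention_days=730),
]
review_date = date(2020, 1, 1)

allowed_before = may_process(records[0], "attrition_model")   # not an original purpose
records[0].extra_consents.add("attrition_model")              # employee gives explicit consent
allowed_after = may_process(records[0], "attrition_model")

kept = cleanse(records, review_date)
```

Storing the stated purposes alongside each record is a design choice of this sketch: it lets a talent analytics project check, before linking datasets, whether the intended analysis was among the purposes communicated to employees.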
It is usually argued that talent analytics is not very different from customer analytics, for instance, which focuses on customer data. While in theory this is correct, in the sense that it provides managers with increased knowledge of how the workforce works, in reality a number of ethical issues arise when dealing with data on personnel. First, by their very nature, data collected by the HR team tend to be sensitive and personal and contain information that may not be divulged (Chen et al. 2012). Second, these data are collected in the context of a contractual relationship that may be skewed in favor of the employer; in other words, while customers may choose not to share their preferred purchases with retailers, in the context of a workplace, employees may not have any choice and in some cases may not be aware of the fact that the data are collected. In addition, data are being collected at a fine-grained level from social media channels, websites, and mobile communications sources and are combined with data that the HR team stores, with the result that the privacy of employees is compromised. Zwitter (2014) points out that free will and individualism are compromised by the use of big data in the workplace. In addition, relying too heavily on talent analytics can make employees feel as if they are reduced to just numbers rather than individuals. Given these ethical issues, how should talent analytics be used in an organization? Importantly, legal standards vary from one country to another (CRF Research 2017). Richards and King (2013) argue that existing privacy protections only focus on securing personally identifying information and therefore may not be enough, as they do not protect workers from the misuse of data within the workplace.
What is really required is the development of an organizational framework for the ethical use of personal data on workers; the framework should be based upon the four principles of privacy, confidentiality, transparency, and identity (Richards and King 2013), which should give workers agency with respect to the use of data within the organization. As analytics functions mature, having a robust governance framework becomes more important (CRF Research 2017). So far, governance has focused primarily on data issues, such as developing data standards and policies for handling sensitive personal data (CIPD 2013). Increasingly, as well as looking at issues such as privacy, ethics, and consent, organizations are going to have to engage key business stakeholders in defining the purpose and the boundaries of the talent analytics function. Several bodies (see for instance CIPD, 2013) have recommended establishing a small decision-making body made up of representatives of business leaders, user communities, data suppliers, and technical staff. Some jurisdictions require organizations to consult with employees and to identify the benefits to employees of data sharing. A small minority of organizations have already established a governance body for talent analytics (CRF Research 2017). For example, JPMorgan Chase has a workforce analytics steering committee made up of those members of the group-level HR operating committee who have enterprise-wide responsibility for an element of HR, such as HR strategy or technology. All these initiatives highlight the importance of trust (from employees to senior management) in supporting talent analytics projects and the need to establish mechanisms that can create and foster trust.
We thus propose a framework that looks at talent analytics as arising in an emergent social space with different actors involved. The link between big data and HR cannot be managed, governed, and sustained in any single place and thus is fundamentally distributed in nature. This requires an alternative view of ethics in which to position talent analytics, in order to both identify and recognize the participants beyond the companies and "usual" stakeholders. The sense of responsibility invoked is about the actual process of engagement with others and with inherent difference (e.g., Clegg and Rhodes 2006). Our framework points towards a collective ethics in the shared space of action. Unlike individual-based ethics, it stresses what we may come to share based on mutual recognition and responsibility within a community (Nancy 1993). This would allow participants to both support and question the values and standards put forward for choosing, for example, when to use predictive analytics. It would also allow for incorporating moments in which to "suspend judgements," thus avoiding premature or hasty decisions between different stakeholders. Thinking of interactions and negotiations this way creates an awareness about the increased need for "opening up spaces of belonging" (Nocker 2017, p. 230), for distributed leadership (e.g., Bolden 2011) and for generating a more inclusive space of action in talent analytics and in the management of talent.

Conclusions
This paper has explored the relationship between big data and human resources management. Our premise is that big data may offer a number of opportunities to HR practitioners. Like any other part of an organization, HR teams produce volumes of data while undertaking their routine activities; these data can be used for the development of standard HR metrics, but they can also be exploited in several ways to provide information on how to generate value via human resources management. The paper has therefore provided a broad overview of talent analytics and its use in a number of organizations.
One of our key insights is that, if used in a proper way, talent analytics may help the senior management team of an organization to align HR strategies to value creation. The key contribution of talent analytics to value creation lies in the fact that it allows one to exploit individual-level data and help design measures that can support employees in a personalized way. The case studies we have analyzed have shown that large organizations use talent analytics to address a number of standard HR issues (such as retention, planning, and engagement). However, we still do not have a strong theoretical framework on how talent analytics creates value. This effort is clearly hampered by the lack of data, as most of the data on talent analytics projects belong to the organizations that have sponsored the projects and that may not be interested in evaluating the evidence. This is not ideal given the fact that case studies provide evidence of a positive correlation between talent analytics and organizational performance. Indeed, if organizations need to have robust evidence on the impact of talent analytics, it is important that an alternative approach to data collection emerges.
The existing practice around talent analytics suggests there are three important factors that moderate the relationship between performance and talent analytics. These include technical knowledge of analytics, access to data, and a good understanding of how to use the results of the analysis to improve performance. However, additional research is needed to quantify the extent to which these factors influence the relationship between performance and talent analytics and more importantly what strategies organizations may put in place to reduce the adverse impact that each of these factors may have on organizational outcomes.
Finally, this paper has analyzed the need for an ethical framework for the use of talent analytics within organizations. This is an area that is sometimes forgotten when working with personal data and conflated into discussions about consent and privacy. In reality, these issues go well beyond consent: talent analytics is usually developed in a context where there is an asymmetric distribution of power between the data owner and data processor. In addition, the contractual relationship that underpins the talent analytics project may be designed in favor of the employer rather than the employees. For these reasons, trust in the way data are used by HR teams is important for talent analytics projects to flourish and be of some use to businesses. The implication is that the development of talent analytics needs to be accompanied by the development of a number of mechanisms that can foster trust between senior management and employees.