Article

Predicting the Productivity of Municipality Workers: A Comparison of Six Machine Learning Algorithms

1 Department of Management Studies, Graphic Era University, Dehradun 248002, India
2 College of Administrative and Financial Sciences, Saudi Electronic University, Riyadh 11673, Saudi Arabia
* Authors to whom correspondence should be addressed.
Economies 2024, 12(1), 16; https://doi.org/10.3390/economies12010016
Submission received: 25 November 2023 / Revised: 1 January 2024 / Accepted: 4 January 2024 / Published: 12 January 2024

Abstract

The municipal sector is one of the most significant areas of local government worldwide. Municipalities provide a range of services to the residents and businesses in their areas, such as water supply, sewage disposal, healthcare, education, housing, and transport. They also promote social and economic development, ensure democratic and accountable governance, and encourage community involvement in local matters. Municipal workers must deliver these services to the public reliably, and employee productivity is one of the main factors influencing overall organizational performance. This article compares six machine learning algorithms, namely XG Boost, Random Forest (RF), Histogram Gradient Boosting Regressor, LGBM Regressor, Ada Boost Regressor, and Gradient Boosting Regressor, on a dataset of municipality workers. The study aims to propose a machine learning approach to predict and evaluate the productivity of municipality workers. An evaluation of the total targeted and actual productivity of each department shows that only 5 of the 12 departments met their targeted productivity. A 3D scatter plot displays the incentive given by the department to each worker based on their productivity. The results show that XG Boost performs best among the six algorithms, with an R Squared of 0.71 and an MSE (Mean Squared Error) of 0.01.

1. Introduction

Organizations strive to enhance employee engagement to improve productivity as they adjust to the new digital era. Productivity helps companies to expand and make use of their human resources. Engaged employees are more productive because they are motivated beyond personal factors to perform for the organization (Harter et al. 2013). Employees’ perception of organizational support plays a crucial role in determining how productive they are at work (Bonaiuto et al. 2022; Fleisher et al. 2011; Li et al. 2023). In an era where machines can handle a wide range of human tasks, they can also predict human performance and productivity (Goumopoulos and Potha 2023). Machine learning is a subfield of artificial intelligence that learns from training data and past experience to predict events automatically without being explicitly programmed (Wardhani et al. 2022). The goal of machine learning is to mimic the human intellectual ability to solve complex problems and analyze them based on experience (Sarker 2022). Depending on the type of method, machine learning algorithms can be used to solve different problems in different industries. When the subject matter calls for prediction and analysis, an appropriate machine learning algorithm aids forecasting by using a set of parameters (Obiedat and Toubasi 2022). Machine learning has been applied in a variety of human resource domains. For example, physiological data gathered by wearable sensors have been used to measure the impact of construction workers’ happiness on productivity; in that work, instances in the dataset were re-weighted for a Random Forest classifier so that each class received the same total weight, a technique for balancing the classes in the dataset (Santhose and Anisha 2023). Machine learning techniques have also proven effective in staff recruitment and job evaluation (Pampouktsi et al. 2023).
The 21st century is witnessing significant advances in artificial intelligence, which has created a need for new approaches to using digital technologies in municipal management (Kazakov et al. 2020). In India, urban areas with a population of more than one million are administered by a municipal corporation, a local government body. The municipal corporation carries out its work through well-organized divisions or departments (Bari and Dey 2022). The services of the municipality include the Housing Board, the Education Department, the Electricity Department, and the Water Supply and Sewage Disposal Undertaking. Skilled and knowledgeable municipality staff serve the millions of people who receive services from each of these departments. The strength of the municipal economy enables the functioning of cities in many countries (Kokenova et al. 2020), including the United Kingdom, the United States, the Philippines, India, South Africa, and numerous other nations. Around the world, these organizations help provide development, medical services, education, housing, and transport by collecting local taxes and administering grants from the state government (Madumo 2012). Municipalities can play a key role in a country’s economic development by creating a conducive environment for investment, innovation, and growth. The workers of the municipalities must be productive enough to achieve the desired target for the year. According to Ali and Anwar (2021) and Karthik and Rao (2022), key factors such as training, leadership, workplace incentives, motivation, rewards, and working conditions affect productivity. Employee productivity in the municipality sector is also affected by organizational culture. Linking rewards to performance and establishing a welcoming environment are two such effects (Hong and Zainal 2022).
One study (Elaho and Odion 2022) broadly posits that working environment, responsibility, and manager support are related to the productivity of employees of business centers on the Ugbowo campus of the University of Benin, Benin City. Another study (Razali et al. 2023) proposed a machine learning approach for predicting the actual productivity of garment workers in Bangladesh, with a focus on achieving the target production without difficulties. These studies additionally suggest that workplace productivity indicators such as workload and supervisor support are useful. A primary objective of many municipalities is to increase employee productivity, particularly for those seeking transparency, accountability, a corporate culture in local government management, and improved service to citizens (Ismajli et al. 2015). To boost employee productivity, an organization needs to find the quickest and easiest method to predict it.

1.1. Gaps Covered in the Present Study

One of management’s primary concerns in any service-providing sector is increasing employee output. However, despite its significance, there is a dearth of theoretical and empirical research on employee productivity in the literature.
This study addresses that gap by constructing a predictive model that assesses worker productivity using machine learning algorithms.
It also compares six different machine learning algorithms on a dataset of municipality workers in Uttarakhand.
Finally, no previous study has evaluated the incentives of individual workers using 3D scatter plots built with a Python library.

1.2. The Main Aim of the Study

Various research studies are focused on combining two or more classifiers and uncovering how the integration of various algorithms and techniques can add to the prediction. The three goals of this research are as follows:
  • To predict the productivity of municipality workers in Uttarakhand from the collected dataset by comparing six different algorithms;
  • To evaluate the difference between the targeted productivity and the actual productivity of all 12 departments in the municipality;
  • To evaluate the degree of incentive provided by the department to each worker according to the productivity generated by that worker during the year.
The research is organized into six major sections. Section 1 defines the research problem and the gaps in the present scenario. Section 2 reviews the latest research on analyzing productivity through machine learning. Section 3 presents the research methodology, explaining the data sampling, data description, pre-processing of data, identification of algorithms, graphs and statistics, correlation matrix, and development of the model. Section 4 contains the results. Section 5 provides the discussion, followed by the conclusion in Section 6. Finally, the limitations of this work and the future scope of this research are described.

2. Theoretical Background

2.1. Employee Productivity

Employee productivity in municipalities is influenced by various factors, including leadership styles, performance evaluation systems, organizational learning, and labor competencies. Implementing an effective performance evaluation system can create a favorable environment for increased productivity (Manu 2015). Organizational learning has been found to have a significant positive relationship with human resource productivity in municipalities (Nkambule 2023). To increase an organization’s overall effectiveness and efficiency, it is crucial to make efficient and effective use of its human resources (Sulistyaningsih 2023). Productivity can also describe the overall performance of a group of workers (Massoudi and Hamdi 2017). Productivity can be broadly defined as the ratio of an output measure to an input measure (Sauermann 2023). Thus, workers’ productivity can be measured as the ratio of an output, such as sales or units produced, to an input, such as the number of hours worked or the cost of labor. Incentives are direct wages paid in addition to salaries and are proportional to work performance. The output produced in daily tasks is used to measure the productivity of an organization’s employees. The purpose of performance appraisal is to determine the ability and skill of an employee to perform a task, evaluated objectively and regularly against benchmarks, whether past- or future-oriented (Clement and Gwaltu 2023). To increase employee productivity, businesses must therefore identify their employees’ strengths and weaknesses. Performance measurement systems are used in municipalities to assess employee productivity and motivation. These systems help in making critical trade-off decisions and program changes. The returns from performance measurement systems are viewed as worth the costs, despite diminishing perceived contributions to individual worker productivity and concerns about morale (Anakpo et al. 2023).
The motivation of employees in local government is influenced by the process of performance assessment. Factors such as salary, professional advancement, the opportunity for promotion, work conditions, and the objective assessment of performance measurements are important motivators (Ibrahim and Cuadrado 2023). The key performance indicators (KPIs) can be used to assess managerial efficiency and competitiveness at the municipal level. The KPIs can be quantified and form a basis for evaluating the effectiveness and competitiveness of the municipal economy (Multan et al. 2023).
Traditionally, labor productivity is derived by aggregating firm-level indicators, such as value added per worker (Sauermann 2023). Workers’ wages can be interpreted as an imperfect indicator of productivity under certain labor-market assumptions (Chau et al. 2022). Higher skill levels are often associated with higher wages, but higher wages can also be influenced by higher labor productivity, giving rise to the problem of reverse causality (Sauermann 2023). Both labor productivity and wages have shortcomings as measures of individual workers’ productivity (Sheehan and Garavan 2022). In an idealized setting, the productivity of every worker would be observable at every instant. However, output is rarely observable at the individual level at a reasonable cost, which makes computing each person’s productivity fundamentally unachievable (Sauermann 2023). Workers maintain the same total output by starting earlier in the day and spending more time on each interview, at the cost of spending more hours in the field for the same total pay (LoPalo 2023). Evidence on temperature, worker productivity, and adaptation from survey data production suggests that the components influencing productivity are efficiency, effectiveness, and quality. According to Zebua and Chakim (2023), there are two ways to measure productivity: operational productivity and financial productivity. Financial productivity measures use monetary units for inputs, whereas operational productivity is a physical measure of inputs and outputs expressed in physical units.

2.2. Machine Learning Prediction on Productivity

Machine learning models have been used to predict productivity in various industries, including oilfield development (Song et al. 2023), oil and gas extraction (Kim et al. 2023), and construction activities (Juszczyk 2023). These models use historical data and various algorithms to estimate productivity and make predictions. Hassani et al. (2019) used four machine learning algorithms to predict overall equipment effectiveness (OEE), a performance measure in manufacturing: support-vector machines, optimized support-vector machines (using genetic algorithms), XG Boost, and deep learning. Balla et al. (2021) conducted a study that showed good results in predicting employee productivity, one of the most important organizational variables. Three algorithms were used in that study: Neural Network (NN), Random Forest (RF), and Linear Regression (LR). Random Forest achieved the best correlation coefficient, MAE, and RMSE values, which suggests that it is well suited to predicting employee productivity. The predictions that Safelite Glass Corporation’s shift to piece rates would increase both the variance in output across employees and average productivity were tested with a new dataset. Output per worker at Safelite increased by 44 percent because of productivity effects. The firm had previously chosen a suboptimal compensation system, as profits also increased with the change (Mallick et al. 2021). A data analysis and machine learning approach was used to predict the productivity of a sintering machine. Industrial data on sintering machine productivity were collected at an integrated steel plant. A linear regression model and an artificial neural network (ANN) model were developed to predict the productivity of the machine using the agglomerate constituents as model inputs.
Machine learning algorithms (Random Forest and logistic regression) may be used to enable store staff to set product prices and discounts based on consumer behavior (Mahoto et al. 2021). This model gave excellent results in predicting product prices. Data mining tools and decision tree techniques have been used to build predictive models and to study how ensemble learning and decision tree techniques improve performance prediction in assembly support systems (Sorostinean et al. 2021). The results show that the gradient-boosted decision tree outperforms all other decision tree-based methods. Machine learning-based strategies have also been used to identify the fundamental factors that affect productivity (Hui et al. 2023). Four machine learning approaches were evaluated, with Extra Trees producing the most reliable results. Production index, formation pressure, effective porosity, total organic carbon, gas saturation, and shale thickness are believed to be the principal factors influencing shale gas production. That study reports an average match rate of 92.3% between the predicted and actual production of three new wells, providing insight into the identification of hydraulically fractured wells to improve productivity. The prediction of employee performance is crucial for human resource management in identifying the strengths and weaknesses of employees and making strategic decisions (Banu et al. 2020). Machine learning algorithms such as XG Boost have also been effective in estimating employee performance, especially when dealing with data from HR Information Systems (Kazakov et al. 2020). Machine learning has also been applied in municipal management to forecast key indicators of socio-economic development, demonstrating its potential for predicting productivity in a municipality.

3. Research Methodology

3.1. Data Sampling

This study is based on the evaluation of predictive models. The data were drawn from the productivity records of workers in four major municipalities in Uttarakhand, India. Uttarakhand is a developing state of India. Municipalities within the state carry out activities such as smart city projects, sewage treatment, sanitation, waste disposal, public facilities, social welfare, infrastructure development, urban planning, and sports and leisure facilities for the smooth functioning of the cities. The total population of workers was considered for the analysis and for calculating the effectiveness and efficiency of the municipalities’ workers (Table 1). The data of 1098 respondents were extracted from the records of the four municipalities. The first municipal corporation has 275 workers, the second has 266, the third has 285, and the fourth has 272.

3.2. Dataset Description

The dataset description in Table 2 details the attributes of the Uttarakhand municipality dataset. Names and values were assigned to the attributes to maintain the confidentiality of the information. Workers from 12 different departments of the four municipalities were included in the dataset. Each department is indicated by a number, as explained in Table 3. The following attributes were identified from the provided dataset: department, targeted productivity, SMV, WIP, overtime, incentive, and actual productivity. The data were provided in unsorted form and were then segregated department-wise, with the respective number of workers in each municipality.
Evaluation of the number of workers in every department:
Table 3 lists the 12 departments of the municipal corporation that were considered in the survey, with the number of workers in each department. The data were provided in mixed form and were segregated by department. The workers of each department were then counted using Microsoft Excel to obtain the total number of workers providing services in the urban areas. The education department has the highest number of workers, and the Project Implementation Unit has the lowest.

3.3. Identification of Algorithms through Lazy Predict Python Library

Lazy Predict is a Python library that fits a large set of baseline models with very little code and reports their scores, which makes it easy to see which models work well on a dataset without parameter tuning. It is simple to install and use. Lazy Predict is open source and distributed under the MIT license. Using Lazy Predict, 42 different algorithms were identified, of which the top 6 were considered for further analysis based on R Squared, adjusted R Squared, and RMSE (root mean squared error). According to Lazy Predict, the Gradient Boosting Regressor works best on the municipality dataset, with an adjusted R Squared of 0.37 and an R Squared of 0.39, as displayed in Table 4.
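The ranking step described above can be sketched in plain Python. This is only an illustration of how a leaderboard sorted by adjusted R Squared might be produced, not the library's own code. The function names, the placeholder models "ModelB" and "ModelC", and all RMSE values are hypothetical; only the R Squared of 0.39 comes from the text. The sample size of 219 (the test set in Section 3.4) and the count of six predictors (the attributes listed in Section 3.2 other than actual productivity) are assumptions about how the adjustment was computed.

```python
def adjusted_r2(r2, n_samples, n_features):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    if n_samples - n_features - 1 <= 0:
        raise ValueError("need n_samples > n_features + 1")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_features - 1)


def rank_models(scores, n_samples, n_features, top_k=6):
    """Sort models by adjusted R^2 (best first) and keep the first top_k."""
    table = [
        (name, r2, adjusted_r2(r2, n_samples, n_features), rmse)
        for name, (r2, rmse) in scores.items()
    ]
    table.sort(key=lambda row: row[2], reverse=True)
    return table[:top_k]


# Hypothetical scores as (R^2, RMSE). Only the 0.39 is taken from the text;
# every other number is a placeholder for illustration.
scores = {
    "GradientBoostingRegressor": (0.39, 0.14),
    "ModelB": (0.30, 0.15),
    "ModelC": (0.35, 0.145),
}

# Assumed: 219 test rows and 6 predictor columns.
leaderboard = rank_models(scores, n_samples=219, n_features=6, top_k=2)
for name, r2, adj, rmse in leaderboard:
    print(f"{name}: R2={r2:.2f}, adjusted R2={adj:.2f}, RMSE={rmse:.3f}")
```

Under these assumptions, an R Squared of 0.39 corresponds to an adjusted R Squared of about 0.37, which is consistent with the values reported for the Gradient Boosting Regressor.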

3.4. Preprocess of Data

The authors studied the data thoroughly before pre-processing it in a Jupyter Notebook using Python. In the pre-processing stage, the data were organized: each department was numbered, the workers were segregated department-wise, and missing values were handled. First, the required libraries were imported and the data were loaded. Each department was assigned a serial number. The WIP column contained 22 missing values, which were replaced with the column mean. Correlation analysis was then applied to check the correlation between the variables. Next, the data were split into training and testing sets. The training set contained 879 records and the testing set contained 219, following the commonly used split of 80% for training and 20% for testing. A 3D scatter plot was also produced to show the highest and lowest incentives received.
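A minimal sketch of the two pre-processing steps described above, mean imputation and an 80/20 split, is given here in plain Python. The helper names and the synthetic WIP values are hypothetical; the authors' notebook presumably used library routines, and the exact rounding rule they applied to reach 879 and 219 records is an assumption.

```python
import random


def impute_mean(values):
    """Replace None entries with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of rows and hold out the last test_fraction of them."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)  # rounds down (assumed rule)
    cut = len(shuffled) - n_test
    return shuffled[:cut], shuffled[cut:]


# Synthetic stand-in for the WIP column: 1098 rows, 22 of them missing.
wip = [float(i % 50) for i in range(1098)]
for i in range(0, 22 * 40, 40):
    wip[i] = None

wip_filled = impute_mean(wip)
train, test = train_test_split(wip_filled)
print(len(train), len(test))  # prints: 879 219
```

Rounding the test share down reproduces the 879/219 partition of the 1098 records reported in the text.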

3.5. Graphs and Statistics

3.5.1. Targeted and Actual Productivity

Figure 1 shows the total actual and targeted productivity of every department. A clear difference can be seen between the targeted and actual productivity of the departments. The dataset already provided each worker’s targeted and actual productivity. Generally, the management of these four municipalities identifies key performance indicators (KPIs) that align with the overall goals and objectives of the municipality. The KPIs are measurable, specific, and relevant to the employee’s role, and they help establish performance metrics that quantify the output and quality of work (Jin et al. 2023). These include quantitative measures such as the number of tasks completed, time taken, Standard Minute Value (SMV), work in progress (WIP), accuracy, overtime, and efficiency. The data were provided in unsorted form and were first segregated by department. Then, using Microsoft Excel, the targeted and actual productivity of every worker were summed to obtain the total targeted and actual productivity of each department. Figure 1 thus shows which departments achieved more or less than their targeted work. Overall, the Public Work, Property Tax, Health, Streetlight, and IT departments achieved more than their targeted productivity, and the rest achieved less.
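The department-level aggregation described above was done in Microsoft Excel; the same summation can be sketched in Python as follows. The function names and the sample rows are hypothetical and chosen only to show the mechanics, so the values do not come from the study's dataset.

```python
from collections import defaultdict


def department_totals(records):
    """Sum targeted and actual productivity per department."""
    totals = defaultdict(lambda: {"targeted": 0.0, "actual": 0.0})
    for department, targeted, actual in records:
        totals[department]["targeted"] += targeted
        totals[department]["actual"] += actual
    return dict(totals)


def departments_meeting_target(totals):
    """Names of departments whose total actual output reaches the target."""
    return sorted(
        name for name, t in totals.items() if t["actual"] >= t["targeted"]
    )


# Made-up rows of (department, targeted productivity, actual productivity).
records = [
    ("IT", 0.6, 0.8),
    ("IT", 0.6, 0.5),
    ("Health", 0.7, 0.75),
    ("Education", 0.8, 0.6),
    ("Education", 0.7, 0.7),
]

totals = department_totals(records)
met = departments_meeting_target(totals)
print(met)
```

Comparing totals rather than individual rows means a department can meet its target even when some of its workers fall short, as the second IT row does in this toy example.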

3.5.2. Evaluation of Incentive Based on Productivity

The 3D scatter plot in Figure 2 depicts the level of incentive provided by the department to each worker according to the productivity generated by the worker during the year. The scatter plot was made using a Python library. According to Figure 2, the targeted productivity of one worker from the IT department of the municipality was 0.6 and the actual productivity was 0.4151724; accordingly, no incentive was received during the year.
Similarly, in Figure 3, the targeted productivity of another worker from the same department was 0.6 and the actual productivity was 0.8643426; accordingly, the worker received an incentive of INR 2880. From these scatter plots, the level of incentive received based on a worker’s performance and productivity can be easily identified.

3.5.3. Correlation Matrix

Using the data, we examined the correlation between the study variables with the Pearson correlation coefficient (Okpara et al. 2023). The highest correlation, 0.91, was found between SMV and overtime, as displayed in Figure 4.
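For reference, the Pearson coefficient behind such a matrix can be computed as in the sketch below. The function names and the three sample columns are hypothetical illustrations; they are not the study's data and will not reproduce the 0.91 reported in Figure 4.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of spreads."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences of length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Pairwise Pearson correlations for a dict of named columns."""
    names = list(columns)
    return {
        a: {b: pearson_r(columns[a], columns[b]) for b in names}
        for a in names
    }


# Illustrative columns only.
columns = {
    "smv": [2.9, 3.9, 11.4, 22.5, 26.2],
    "overtime": [960, 1440, 3660, 6840, 7080],
    "incentive": [0, 0, 50, 38, 98],
}
matrix = correlation_matrix(columns)
print(round(matrix["smv"]["overtime"], 2))
```

The matrix is symmetric with ones on the diagonal, so only the entries above or below the diagonal carry information.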

3.6. Development of the Model

Hist Gradient Boosting Regressor: Histogram-based gradient boosting is a method for training the decision trees in a gradient boosting ensemble faster. Gradient boosting is a statistical framework that treats training as fitting an additive model and permits arbitrary loss functions, which greatly extends the technique’s capabilities. As a result, gradient boosting ensembles are the preferred method for most structured predictive modeling tasks, such as those on tabular data. The construction of the trees can be sped up considerably by discretizing (binning) continuous input variables into a few hundred unique values (Brownlee 2021).
LGBM Regressor: LightGBM is a gradient boosting framework based on decision trees that improves model efficiency while reducing memory use (Figure 5). It uses two novel techniques: Exclusive Feature Bundling (EFB) and Gradient-based One-Side Sampling (GOSS). These techniques address the limitations of the histogram-based algorithm that underlies GBDT (Gradient Boosting Decision Tree) frameworks. Together, GOSS and EFB form the distinguishing features of the LightGBM algorithm, helping the model work efficiently and giving it an advantage over alternative GBDT frameworks. LightGBM grows the tree leaf-wise instead of level by level. The leaf with the greatest delta loss is selected for growth. For a fixed number of leaves, the leaf-wise algorithm achieves lower loss than the level-wise algorithm. However, building a tree leaf by leaf may increase the model’s complexity and result in overfitting when dealing with small datasets.
Gradient Boosting Regressor: A gradient-boosting machine (GBM) is a machine learning algorithm that boosts weak learners, typically decision trees, into a stronger learner (Kuhn and Johnson 2013). At each iteration, the regression model adds a new decision tree to the existing model to lower the error rate and improve performance. For the forecasting model, the GBM constructs a regression model that can estimate employee productivity based on its correlation with other factors (Park et al. 2023). Gradient boosting is a technique for improving machine learning models whose predictive power is low. Gradient-boosted regression (GBR) is an iterative technique that increases a model’s predictive power at every learning step. By handling outliers and missing values, it can help the model generalize better. Each new weak learner is fitted to the difference between the predicted and actual target values, which improves the predictive model and optimizes the loss function. The algorithm works by training decision trees (Algorithm 1); it weights each tree and ranks them according to their difficulty. As Figure 6 shows, gradient boosting uses an iterative process to combine various weak models into a stronger model while minimizing bias error (Rahman and Nisher 2023). Even when dealing with classification problems, gradient boosting always makes use of regression trees (Johansson n.d.).
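Algorithm 1. Algorithm function of Gradient Boosting.
  Input: a training set $\{(x_i, y_i)\}_{i=1}^{n}$, a differentiable loss function $L(y, F(x))$, and the number of iterations $M$.
  1. Begin the model with a constant value: $F_0(x) = \arg\min_{\gamma} \sum_{i=1}^{n} L(y_i, \gamma)$
  2. For $m$ ranging from 1 to $M$:
  • Calculate the so-called pseudo-residuals: $r_{im} = -\left[ \frac{\partial L(y_i, F(x_i))}{\partial F(x_i)} \right]_{F(x) = F_{m-1}(x)}$ for $i = 1, \ldots, n$
  • Fit a base learner (or weak learner, such as a tree) $h_m(x)$ that is closed under scaling to the pseudo-residuals, i.e., train it with the training set $\{(x_i, r_{im})\}_{i=1}^{n}$ (Johansson n.d.)
  • Determine the multiplier $\gamma_m$ by solving the one-dimensional optimization problem: $\gamma_m = \arg\min_{\gamma} \sum_{i=1}^{n} L(y_i, F_{m-1}(x_i) + \gamma h_m(x_i))$
  • Revise the model: $F_m(x) = F_{m-1}(x) + \gamma_m h_m(x)$
  3. Output $F_M(x)$.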
The individual learning algorithm in this study is one of the most frequently used decision trees (DTs), namely CART. The following four stages, taken together, represent how the GBRT is carried out:
(1) Collect and process the data, for example by adjusting the input/output variables and assembling the training/testing datasets;
(2) Train the regression model with the GBRT using the training dataset;
(3) Verify the trained model with the testing dataset;
(4) Apply the model to real-world problems.
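The boosting loop in Algorithm 1 can be illustrated with a small, self-contained sketch. It assumes squared-error loss, for which the pseudo-residuals reduce to the ordinary residuals $y_i - F_{m-1}(x_i)$, a single input feature, and depth-one regression trees (stumps) as the base learner. A fixed learning rate stands in for the line-search multiplier. All names and the toy data are hypothetical; this is not the implementation used in the study.

```python
def fit_stump(xs, residuals):
    """Least-squares regression stump: one threshold, one mean per side."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:  # all inputs identical: fall back to a constant
        mean = sum(residuals) / len(residuals)
        return lambda x: mean
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def gradient_boost(xs, ys, n_rounds=50, learning_rate=0.1):
    """Gradient boosting for squared loss with stumps as weak learners."""
    f0 = sum(ys) / len(ys)  # step 1: the constant minimizing squared loss
    stumps = []

    def predict(x):
        return f0 + learning_rate * sum(h(x) for h in stumps)

    for _ in range(n_rounds):  # step 2: fit each stump to current residuals
        residuals = [y - predict(x) for x, y in zip(xs, ys)]
        stumps.append(fit_stump(xs, residuals))
    return predict  # step 3: the final additive model


# Made-up data: one feature and a productivity-like target.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [0.40, 0.42, 0.45, 0.50, 0.70, 0.72, 0.75, 0.80]
model = gradient_boost(xs, ys)
print(round(model(2.0), 3), round(model(7.0), 3))
```

Each round shrinks the training residuals a little, which is why many small steps with a modest learning rate are usually preferred over a few large ones.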
Random Forest Regressor: RFR divides the parameter iteratively using a series of binary splits (Figure 7). Each of these splits is associated with the value of a specific predictor grid that maximizes the variations in the “tree’s” branches. One decision tree consists of one split and all its related branches. Each branch is made up of a random subset of nodes that stand in for specific predictors. Each predictor node has many potential predictors associated with it, and it is at these nodes that a decision to split the branch further and add two new predictors is made at random. The process is repeated until no further splits are possible, leaving only terminal nodes or “leaves”. Typically, the RFR will keep performing binary splits until only one predictor is found on a leaf, which may result in overfitting. However, using a minimum-samples-per-leaf approach lessens the chance that the prediction will be overfit. The predictands of the terminal node (or “leaf”), which must number at least the predetermined minimum samples per leaf, are averaged to produce a single tree prediction value. The overall prediction is calculated by averaging the outcomes from every tree in the “forest”. RFR cannot predict outside the range of the direct observations within the original dataset because the algorithm can only make predictions based on the dataset with which it was initially trained (Graw et al. 2021; refer to Figure 7).
Figure 7 illustrates how the Random Forest regressor makes a prediction. Every brown node is a distinct predictor with a unique set of predictors. The leaves are represented by the green nodes, and there are a few associated predictors. The prediction is made at the leaf node where the minimum-samples-per-leaf threshold is met. Each tree’s prediction path is shown by a series of red arrows.
Ada Boost Algorithm: Ada Boost is an ensemble of many weak-learner decision trees that outperform random guessing. The adaptive AdaBoost method passes the gradient of previous trees to succeeding trees to lower the error of the prior tree. As a result, the successive learning of trees at each step builds a strong learner. The weighted average of the forecasts made by each tree serves as the final forecast. AdaBoost is more resistant to outliers and noisy data due to its high adaptability, which is a crucial requirement in our case. Additionally, the algorithm is designed so that later trees are fed the knowledge gained by earlier trees (Algorithm 2), allowing them to concentrate on the training samples that are challenging to predict (Freund and Schapire 1997; Patil et al. 2018).
Algorithm 2. Algorithm function of Ada Boost.
  Algorithm
  1. Consider a training set $(x_i, y_i)$, initialize the weights $w_{1,1}, \ldots, w_{n,1}$ to $1/n$, and initialize the number of weak learners $h$.
  2. For $g$ in 1 to $G$:
  i. Compute the error of each learner by using the square loss function
$$E = L(f(x_i), y_i)$$
  ii. Select the weak learner $h_g^i$ which minimizes the error.
  iii. Add it to the tree-building algorithm
$$F_g(x) = F_{g-1}(x) + A\,h_g^i(x)$$
where $A$ is the learning rate.
  iv. Update the weights $w_{1,1}, \ldots, w_{n,1}$.
  3. $F_g(x)$ is the final prediction (Freund and Schapire 1997).
$$F_m(x) = F_{m-1}(x) + \arg\min_{h} \sum_{i=1}^{n} L\big(y_i,\, F_{m-1}(x_i) + h(x_i)\big)$$
where $F_m(x)$ is the overall model, $F_{m-1}(x)$ is the overall model obtained in the previous round, $y_i$ is the observed value of the $i$-th sample, and $h(x_i)$ is the newly added tree (Freund and Schapire 1997) (refer to Figure 8).
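The additive update in the equation above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the AdaBoost implementation used in the study: the weak learner is a one-split "stump", the loss is the squared loss (so the minimizing tree is the one fitted to the current residuals), and the sample-weight update of step iv is omitted. All names and data are hypothetical.

```python
from statistics import mean


def fit_stump(xs, residuals):
    """Return the one-split regressor h that minimizes squared loss."""
    best = None
    for t in sorted(set(xs))[:-1]:  # candidate thresholds
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        l_mean, r_mean = mean(left), mean(right)
        sse = (sum((r - l_mean) ** 2 for r in left)
               + sum((r - r_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, t, l_mean, r_mean)
    _, t, l_mean, r_mean = best
    return lambda x: l_mean if x <= t else r_mean


def boost(xs, ys, n_rounds=50, learning_rate=0.1):
    """Forward stagewise fit: F_m(x) = F_{m-1}(x) + A * h(x)."""
    base = mean(ys)                       # F_0: a constant model
    learners = []
    preds = [base] * len(xs)
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        h = fit_stump(xs, residuals)      # argmin over h for squared loss
        learners.append(h)
        preds = [p + learning_rate * h(x) for p, x in zip(preds, xs)]
    return lambda x: base + learning_rate * sum(h(x) for h in learners)


def mse(ys, preds):
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)


# Hypothetical data: a simple nonlinear target.
xs = [float(i) for i in range(10)]
ys = [x ** 2 for x in xs]
model = boost(xs, ys)
baseline_mse = mse(ys, [mean(ys)] * len(ys))
boosted_mse = mse(ys, [model(x) for x in xs])
```

Each round adds a small correction to the previous model, so the training error of `model` falls below that of the constant baseline.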
XG Boost Regressor: XG Boost is a distributed gradient boosting library designed to be efficient, versatile, and portable. It implements machine learning methods under the Gradient Boosting framework and provides parallel tree boosting to tackle a wide range of data science problems quickly and accurately. XG Boost is an extension of gradient-boosted decision trees (GBM) that was created specifically to increase speed and effectiveness. When compared to the other benchmarked implementations from R, Python, and Spark, XG Boost was almost always faster, and it was also faster than the other algorithms compared. The XG Boost gradient boosting method builds an ensemble of decision trees sequentially, using a gradient descent optimization algorithm to reduce model error (Chen and Guestrin 2016; Zamani Joharestani et al. 2019; Shwartz-Ziv and Armon 2022) (refer to Figure 9).

4. Results

Evaluation of the Model

Once the predictive model had been built with the machine learning algorithms, its accuracy was analyzed and evaluated using two metrics: MSE (Mean Squared Error) and R Squared.
MSE: In statistics, the Mean Squared Error (MSE), or mean squared deviation (MSD), of an estimator measures the average of the squares of the errors, that is, the average squared difference between the estimated and actual values (Melibaev et al. 2023). MSE is a risk function that represents the expected value of the squared error loss.
$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( X_i - \hat{X}_i \right)^2$$
  • $n$ = number of data points
  • $X_i$ = observed values
  • $\hat{X}_i$ = predicted values
R Squared: In statistics, the coefficient of determination, abbreviated as R² or r² and pronounced “R squared”, is the fraction of the variation in the dependent variable that is predictable from the independent variables (Piepho 2023).
$$R^2 = 1 - \frac{RSS}{TSS}$$
  • $R^2$ = Coefficient of determination;
  • RSS = Sum of squares of residuals;
  • TSS = Total sum of the squares.
Both metrics were computed with the MSE and R Squared functions of the scikit-learn library.
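As an illustration of the two formulas, the following sketch computes them directly in plain Python. The function names and the sample values are hypothetical and are not taken from the municipality data; in scikit-learn the corresponding functions are `mean_squared_error` and `r2_score` in `sklearn.metrics`.

```python
def mse(observed, predicted):
    """Average squared difference between observed and predicted values."""
    n = len(observed)
    return sum((x - x_hat) ** 2 for x, x_hat in zip(observed, predicted)) / n


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - RSS / TSS."""
    mean_obs = sum(observed) / len(observed)
    rss = sum((x - x_hat) ** 2 for x, x_hat in zip(observed, predicted))
    tss = sum((x - mean_obs) ** 2 for x in observed)
    return 1 - rss / tss


# Hypothetical productivity values on the 0-1 scale.
observed = [0.8, 0.6, 0.7, 0.9]
good_fit = [0.75, 0.65, 0.7, 0.85]
poor_fit = [0.6, 0.9, 0.9, 0.6]

good_mse, good_r2 = mse(observed, good_fit), r_squared(observed, good_fit)
poor_r2 = r_squared(observed, poor_fit)
```

With these values `good_r2` is close to 0.85, while `poor_r2` is negative: R Squared drops below zero whenever the residual sum of squares exceeds the total sum of squares, that is, when a model predicts worse than the mean of the observed values.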
This evaluation aimed to achieve the highest accuracy, measured by R Squared and Mean Squared Error (MSE), in predicting workers’ productivity. Six algorithms, specifically XG Boost, Hist Gradient Boosting, Ada Boost, LGBM, Random Forest, and Gradient Boosting, were compared with one another. The goal was the highest R Squared with the lowest MSE for predicting municipality workers’ productivity. Together, R² and MSE provide insight into both the proportion of variance explained and the magnitude of the errors (Hayduk 2006). We did not use MAE because it is less sensitive to outliers: it does not square the errors but represents the average absolute difference between predicted and actual values (Hodson 2022). Instead, we used MSE. Similarly, we did not use RMSE, which lazy predict reported earlier in the methodology, to compare the performance of the six algorithms (Figure 10). RMSE is the square root of MSE and has the advantage of having the same units as the target variable (Chai and Draxler 2014), so it ranks the models in the same order as MSE.
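The difference in outlier sensitivity between MAE, MSE, and RMSE can be seen in a small sketch. The error values below are hypothetical and chosen only to make the effect visible.

```python
import math


def mae(errors):
    """Mean absolute error: errors are not squared."""
    return sum(abs(e) for e in errors) / len(errors)


def mse(errors):
    """Mean squared error: large errors are weighted more heavily."""
    return sum(e * e for e in errors) / len(errors)


def rmse(errors):
    """Root mean squared error: square root of MSE, in the target's units."""
    return math.sqrt(mse(errors))


clean = [0.1, -0.1, 0.1, -0.1]          # hypothetical prediction errors
with_outlier = [0.1, -0.1, 0.1, -1.0]   # same errors, one large miss

mae_growth = mae(with_outlier) / mae(clean)
mse_growth = mse(with_outlier) / mse(clean)
```

A single large error inflates MSE far more than MAE, and because the square root is monotonic, sorting models by RMSE gives the same order as sorting them by MSE.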
First, all six algorithms were applied to the municipal productivity dataset, and the resulting MSE and R Squared values were recorded (refer to Table 5). The highest R Squared value, 0.71, and the lowest MSE, 0.01, were obtained with the XG Boost algorithm. The lowest R Squared value, −0.80, with an MSE of 0.02, was obtained with the Ada Boost Regressor. A negative R Squared indicates that a model fits the test data worse than simply predicting the mean of the observed values. Figure 10 gives a visual overview of the R Squared and MSE results. Reviewing each algorithm and its results, as displayed in the chart, shows that XG Boost performs best on the dataset with highly significant features.

5. Discussion

Goumopoulos and Potha (2023) and Boyacı et al. (2023) indicated that machines are capable of handling almost all the possible tasks of human beings and can also predict their performance and productivity. Hassani et al. (2019) and Balla et al. (2021) described how productivity can be evaluated or predicted through various machine learning algorithms. More broadly, this paper developed a model that other municipalities with the same set of variables can use to calculate the productivity of employees or of any department (Razali et al. 2023). The current study contributes a predictive machine learning model with a set of variables comprising Department Number, targeted productivity, Standard Minute Value, work in progress, overtime, amount of financial incentive, and actual productivity. Further, with the help of the lazy predict library, this paper identified six algorithms, among which lazy predict ranked the Gradient Boosting Regressor best (Table 4). The data provided by the sources were organized to find the total number of workers in each department across the four major municipalities. The sums of the targeted and actual productivity of the 12 departments were evaluated using Microsoft Excel, and the difference between total targeted and actual productivity was also identified. Based on the correlation analysis and the comparison of each algorithm and its results in Table 5, XG Boost performs best on the dataset with highly significant features. The incentive received by each worker can be easily identified with the help of a 3D Scatter plot. Lastly, the correlation matrix is drawn as a heat map of classifiers that shows the correlation between the factors using the Pearson correlation coefficient (Okpara et al. 2023). The highest correlation, with a value of 0.91, is between SMV and overtime.
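For reference, the Pearson correlation coefficient used in the heat map can be computed as in the sketch below. The SMV and overtime values are hypothetical and are not taken from the study's data set; with pandas, `DataFrame.corr()` returns the same coefficient by default for every pair of columns.

```python
import math


def pearson_r(x, y):
    """Pearson correlation: covariance divided by the product of spreads."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical values: overtime tends to rise with SMV.
smv = [5.0, 10.0, 15.0, 20.0, 25.0]
overtime = [900.0, 2100.0, 2800.0, 4200.0, 4900.0]
r_smv_overtime = pearson_r(smv, overtime)
```

The coefficient lies between −1 and 1, so a value such as 0.91 indicates a strong positive linear relationship between the two variables.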

6. Conclusions

In the ever-evolving landscape of organizational management, the quest to enhance employee productivity has led to the exploration of innovative approaches, with machine learning (ML) emerging as a promising tool. This paper aimed to predict the productivity of municipality workers in the Uttarakhand state of India. The data set includes targeted productivity, work in progress, Standard Minute Value, overtime, incentive, and actual productivity. According to the lazy predict library, the Gradient Boosting Regressor works best among the candidate algorithms on the municipality data set, with an adjusted R Squared of 0.37 and an R Squared of 0.39. Further, correlation analysis was applied, and the data were split into training and testing sets. A 3D Scatter plot and a correlation matrix were also drawn. After applying the exploratory factor analysis, the performance evaluation identified XG Boost as the best-performing algorithm on the dataset, with an MSE of 0.01 and an R Squared of 0.71. In addition, we calculated the total number of workers in 12 different departments of four municipalities. This paper also analyzed the difference between the targeted and actual productivity of the 12 departments of the municipalities in Uttarakhand. The results show that the Public Work, Property Tax, Health, Street Light, and IT departments exceeded their targeted productivity, while the rest fell short of it. Municipalities or any other sector can use this model to predict productivity and, with the help of a 3D Scatter plot, easily identify the most productive employees and their incentives.
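The relationship between R Squared and adjusted R Squared can be sketched with the standard formula. Which sample size and number of predictors lazy predict used internally is not stated in this paper; the example below assumes the 219 test rows from Table 1 and the six input variables from Table 2, and the function name is illustrative.

```python
def adjusted_r2(r2, n_samples, n_features):
    """Standard adjustment: 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1 - (1 - r2) * (n_samples - 1) / (n_samples - n_features - 1)


# Assumed inputs: R Squared 0.39 (Gradient Boosting Regressor, Table 4),
# n = 219 test rows, p = 6 predictors.
gbr_adjusted = adjusted_r2(0.39, n_samples=219, n_features=6)
```

Under these assumptions the result rounds to 0.37, which agrees with the adjusted value reported for the Gradient Boosting Regressor.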

7. Future Implications and Limitations of the Study

The limitations of the present study point to directions for future research. Firstly, many factors that were not taken into consideration, such as salary, presenteeism, absenteeism, and rewards, can be used in further research on municipalities or any other service sector worldwide. Secondly, a similar study can be conducted in other countries, using the same model in various industries, companies, or any public or private organization to evaluate the factors that enhance performance or productivity. Thirdly, since this study focused on a small sample of only four municipalities, scholars can take up the remaining municipalities in the Uttarakhand state of India, analyze the overall productivity of each department, and add significant findings to the present study. Lastly, this study is limited to the analysis of data through machine learning algorithms; for more advanced research, one can analyze the data set using deep learning algorithms.

Author Contributions

Conceptualization, P.B., A.G., A.M., A.J. and M.A.; Methodology, P.B., A.G., A.M., A.J. and M.A.; Formal analysis, P.B., A.G., A.M., A.J. and M.A.; Investigation, P.B., A.G., A.M. and M.A.; Writing—original draft, P.B., A.G., A.M., A.J. and M.A.; Writing—review & editing, P.B., A.G., A.M., A.J. and M.A. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data will be made available on request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Ali, Bayad Jamal, and Govand Anwar. 2021. An empirical study of employees’ motivation and its influence job satisfaction. International Journal of Engineering, Business and Management 5: 21–30. [Google Scholar]
  2. Anakpo, Godfred, Zanele Nqwayibana, and Syden Mishi. 2023. The Impact of Work-from-Home on Employee Performance and Productivity: A Systematic Review. Sustainability 15: 4529. [Google Scholar] [CrossRef]
  3. Balla, Imanuel, Sri Rahayu, and Jajang Jaya Purnama. 2021. Garment employee productivity prediction using random forest. Jurnal Techno Nusa Mandiri 18: 49–54. [Google Scholar] [CrossRef]
  4. Banu, Sohara, Nipun Agarwal, Akhil Singh, Sobiya Shaik, and P. Sai Nikitha. 2020. Machine learning algorithm to predict and improve efficiency of employee performance in organizations. International Journal of Advanced Research in Computer Science 11: 6. [Google Scholar]
  5. Bari, M. Ehteshamul, and Pritam Dey. 2022. Local Governance in India During a Pandemic: A Case for Granting Agency to Municipal Governments. In International Handbook of Disaster Research. Singapore: Springer Nature, pp. 1–19. [Google Scholar]
  6. Bonaiuto, Flavia, Stefania Fantinelli, Alessandro Milani, Michela Cortini, Marco Cristian Vitiello, and Marino Bonaiuto. 2022. Perceived organizational support and work engagement: The role of psychosocial variables. Journal of Workplace Learning 34: 418–36. [Google Scholar]
  7. Boyacı, Tamer, Caner Canyakmaz, and Francis de Véricourt. 2023. Human and machine: The impact of machine input on decision making under cognitive limitations. Management Science, ahead of print. [Google Scholar]
  8. Brownlee, Jason. 2021. Histogram-Based Gradient Boosting Ensembles in Python. Available online: machinelearningmastery.com (accessed on 27 April 2021).
  9. Chau, Nancy, Ravi Kanbur, and Vidhya Soundararajan. 2022. Employer Power and Employment in Developing Countries. The SC Johnson College of Business Applied Economics and Policy Working Paper Series, (2022-07); Rochester: SSRN. [Google Scholar]
  10. Chen, Tianqi, and Carlos Guestrin. 2016. Xgboost: A scalable tree boosting system. Paper presented at the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, August 13–17; pp. 785–94. [Google Scholar]
  11. Clement, Bernadetha, and Paul Martin Gwaltu. 2023. Assessing Factors influencing Employees Work Productivity in Tanzania’s Local Government. International Journal of Engineering, Business and Management 7: 12–19. [Google Scholar] [CrossRef]
  12. Elaho, Omoruyi Bernard, and Amuen Samson Odion. 2022. The Impact of Work Environment on Employee Productivity: A Case Study of Business Centers in University of Benin Complex. Amity Journal of Management Research 5: 782–97. [Google Scholar]
  13. Feng, De Cheng, and Bo Fu. 2020. Shear strength of internal reinforced concrete beam-column joints: Intelligent modeling approach and sensitivity analysis. Advances in Civil Engineering 2020: 1–19. [Google Scholar] [CrossRef]
  14. Fleisher, Belton M., Yifan Hu, Hu Li, and Seonghoon Kim. 2011. Economic transition, higher education, and worker productivity in China. Journal of Development Economics 94: 86–94. [Google Scholar] [CrossRef]
  15. Freund, Yoav, and Robert E. Schapire. 1997. A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences 55: 119–39. [Google Scholar] [CrossRef]
  16. Goumopoulos, Christos, and Nektaria Potha. 2023. Mental fatigue detection using a wearable commodity device and machine learning. Journal of Ambient Intelligence and Humanized Computing 14: 10103–21. [Google Scholar] [CrossRef]
  17. Graw, J. H., W. T. Wood, and B. J. Phrampus. 2021. Predicting global marine sediment density using the random forest regressor machine learning algorithm. Journal of Geophysical Research: Solid Earth 126: e2020JB020135. [Google Scholar]
  18. Harter, James K., Frank L. Schmidt, Sangeeta Agrawal, Anthony Blue, Stephanie K. Plowman, Patrick Josh, and Jim Asplund. 2013. The Relationship between Engagement at Work and Organizational Outcomes. Washington, DC: Gallup Poll Consulting University Press. [Google Scholar]
  19. Hassani, Ibtissam E., Choumicha E. I. Mazgualdi, and Tawfik Masrour. 2019. Artificial intelligence and machine learning to predict and improve efficiency in manufacturing industry. arXiv arXiv:1901.02256. [Google Scholar]
  20. Hayduk, Leslie A. 2006. Blocked-error-R²: A conceptually improved definition of the proportion of explained variance in models containing loops or correlated residuals. Quality and Quantity 40: 629–49. [Google Scholar] [CrossRef]
  21. Hodson, Timothy O. 2022. Root-mean-square error (RMSE) or mean absolute error (MAE): When to use them or not. Geoscientific Model Development 15: 5481–87. [Google Scholar]
  22. Hong, Liang, and Siti Rohaida Mohamed Zainal. 2022. The Mediating Role of Organizational Culture (OC) on the Relationship between Organizational Citizenship Behavior (OCB) and Innovative Work Behavior (IWB) to Employee Performance (EP) in Education Sector of Malaysia. Global Business & Management Research 14: 1022–43. [Google Scholar]
  23. Hui, Gang, Zhangxin Chen, Youjing Wang, Dongmei Zhang, and Fei Gu. 2023. An integrated machine learning-based approach to identifying controlling factors of unconventional shale productivity. Energy 266: 126512. [Google Scholar]
  24. Ibrahim, Muna, and Esther Cuadrado. 2023. The Impact of Corporate Culture on Employee Performance: A Scoping Review. Migration Letters 20: 1267–84. [Google Scholar]
  25. Ismajli, Naim, Jusuf Zekiri, Ermira Qosja, and Ibrahim Krasniqi. 2015. The importance of motivation factors on employee performance in Kosovo municipalities. Journal of Political Sciences & Public Affairs 3: 2–6. [Google Scholar]
  26. Jin, Yue, Makram Bouzid, Armen Aghasaryan, and Ricardo Rocha. 2023. Community Selection for Multivariate KPI Predictions in a 2-Tier System. In NOMS 2023–2023 IEEE/IFIP Network Operations and Management Symposium. Piscataway: IEEE, pp. 1–5. [Google Scholar]
  27. Johansson, Richard. n.d. An Intuitive Explanation of Gradient Boosting. Available online: https://www.cse.chalmers.se/~richajo/dit866/files/gb_explainer.pdf (accessed on 1 January 2024).
  28. Juszczyk, Michal. 2023. Construction Work Efficiency Analysis—Application of Probabilistic Approach and Machine Learning for Formworks Assembly. Applied Sciences 13: 5780. [Google Scholar] [CrossRef]
  29. Karthik, Dasari, and C. B. Kameswara Rao. 2022. Identifying the significant factors affecting the masonry labour productivity in building construction projects in India. International Journal of Construction Management 22: 464–72. [Google Scholar] [CrossRef]
  30. Kazakov, Oleg D., Natalya A. Kulagina, and Natalya Y. Azarenko. 2020. Machine Learning Methods in Municipal Formation. In Growth Poles of the Global Economy: Emergence, Changes and Future Perspectives. Cham: Springer, pp. 339–46. [Google Scholar]
  31. Kim, Sungil, Kwang Hyun Kim, and Jung Tek Lim. 2023. Synergistic enhancement of productivity prediction using machine learning and integrated data from six shale basins of the USA. Geoenergy Science and Engineering 229: 212068. [Google Scholar]
  32. Kokenova, A. T., T. N. Mashirova, K. K. Mamutova, R. A. Kaukeshova, and B. N. Sabenova. 2020. Adaptation of models of foreign countries in the management of municipal development and its resource provision. Научный журнал «Дoклады НАН РК» 3: 175–84. [Google Scholar] [CrossRef]
  33. Kuhn, Max, and Kjell Johnson. 2013. Applied Predictive Modeling. Berlin and Heidelberg: Springer, vol. 26, pp. 203–6. [Google Scholar]
  34. Li, Daihong, Zhili Tang, Qian Kang, Xiaoyu Zhang, and Youhua Li. 2023. Machine Learning-Based Method for Predicting Compressive Strength of Concrete. Processes 11: 390. [Google Scholar] [CrossRef]
  35. LoPalo, Melissa. 2023. Temperature, Worker Productivity, and Adaptation: Evidence from Survey Data Production. American Economic Journal: Applied Economics 15: 192–229. [Google Scholar] [CrossRef]
  36. Madumo, Onkgopotse Senatla. 2012. Contexualising leadership challenges in municipalities: A developmental impression. African Journal of Public Affairs 5: 82–92. [Google Scholar]
  37. Mahoto, Naeem, Rabia Iftikhar, Asadullah Shaikh, Yousef Asiri, Abdullah Alghamdi, and Khairan Rajab. 2021. An Intelligent Business Model for Product Price Prediction Using Machine Learning Approach. Intelligent Automation & Soft Computing 30: 147–59. [Google Scholar]
  38. Mallick, Arpit, Subhra Dhara, and Sushant Rath. 2021. Application of machine learning algorithms for prediction of sinter machine productivity. Machine Learning with Applications 6: 100186. [Google Scholar] [CrossRef]
  39. Mandot, Pushkar. 2017. What Is LightGBM, How to Implement It? How to Fine Tune the Parameters? San Francisco: Medium. [Google Scholar]
  40. Manu, Christian Addai. 2015. The Effects of Work Environment on Employees Productivity in Government Organizations. A Case Study of Obuasi Municipal Assembly. Ph.D. dissertation, Kwame Nkrumah University of Science and Technology, Kumasi, Ghana. [Google Scholar]
  41. Massoudi, Aram Hanna, and Samir Salah Aldin Hamdi. 2017. The Consequence of work environment on Employees Productivity. IOSR Journal of Business and Management 19: 35–42. [Google Scholar]
  42. Melibaev, Makhmudjon, Sadiqjon Negmutullaev, Makhliyo Jumaeva, and Saydullahon Akbarov. 2023. Point estimation of the true value and mean square deviation of the measurement. Science and Innovation 2: 179–86. [Google Scholar]
  43. Multan, Ewa, Marzena Wójcik-Augustyniak, Bartosz Sobotka, and Jakub Bis. 2023. Application of Performance and Efficiency Indicators in Measuring the Level of Success of Public Universities in Poland. Sustainability 15: 13673. [Google Scholar] [CrossRef]
  44. Nadeem. 2021. Introduction to XGBoost Algorithm. San Francisco: Medium. [Google Scholar]
  45. Nkambule, Bongani Innocent. 2023. Organisational Learning and Knowledge Sharing Culture in Township Schools: An Exploration of Effective and Ineffective Practices. Jurnal Penelitian dan Pengkajian Ilmu Pendidikan: E-Saintika 7: 60–74. [Google Scholar] [CrossRef]
  46. Obiedat, Ruba, and Sara Amjad Toubasi. 2022. A Combined Approach for Predicting Employees’ Productivity based on Ensemble Machine Learning Methods. Informatica 46: 49–58. [Google Scholar]
  47. Okpara, Chinedu R., Victor E. Idigo, and Chukwunenye S. Okafor. 2023. Comparative analysis of the features of a 5G network production dataset: The machine learning approach. European Journal of Engineering and Technology Research 8: 52–57. [Google Scholar]
  48. Pampouktsi, Panagiota, Spyridon Avdimiotis, Manolis Maragoudakis, Markos Avlonitis, Nikita Samantha, Praveen Hoogar, George Mugambage Ruhago, and Wcyliffe Rono. 2023. Techniques of Applied Machine Learning Being Utilized for the Purpose of Selecting and Placing Human Resources within the Public Sector. Journal of Information System Exploration and Research 1: 1–16. [Google Scholar] [CrossRef]
  49. Park, Soyoung, Solyong Jung, Jaegul Lee, and Jin Hur. 2023. A Short-Term Forecasting of Wind Power Outputs Based on Gradient Boosting Regression Tree Algorithms. Energies 16: 1132. [Google Scholar] [CrossRef]
  50. Patil, Sangram, Aum Patil, and Vikas M. Phalle. 2018. Life prediction of bearing by using adaboost regressor. Paper presented at the TRIBOINDIA-2018 an International Conference on Tribology, Mumbai, India, December 15. [Google Scholar]
  51. Piepho, Hans Peter. 2023. An adjusted coefficient of determination (R²) for generalized linear mixed models in one go. Biometrical Journal 65: 2200290. [Google Scholar] [CrossRef]
  52. Rahman, Md. Mostafizur, and Sumiya Akter Nisher. 2023. Predicting Average Localization Error of Underwater Wireless Sensors via Decision Tree Regression and Gradient Boosted Regression. Paper presented at the International Conference on Information and Communication Technology for Development: ICICTD 2022, Khulna, Bangladesh, July 29–30; Singapore: Springer Nature, pp. 29–41. [Google Scholar]
  53. Razali, Mohd Norhisham, Norizuandi Ibrahim, Rozita Hanapi, Norfarahzila Mohd Zamri, and Syaifulnizam Abdul Manaf. 2023. Exploring Employee Working Productivity: Initial Insights from Machine Learning Predictive Analytics and Visualization. Journal of Computing Research and Innovation 8: 235–45. [Google Scholar]
  54. Santhose, Samuel Sam, and Baby Anisha. 2023. Psychological improvement in Employee Productivity by Maintaining Attendance System using Machine Learning Behavior. Journal of Community Psychology 51: 270–83. [Google Scholar] [CrossRef]
  55. Sarker, Iqbal H. 2022. Machine learning for intelligent data analysis and automation in cybersecurity: Current and future prospects. Annals of Data Science 10: 1473–98. [Google Scholar] [CrossRef]
  56. Sauermann, Jan. 2023. Performance Measures and Worker Productivity. Bonn: IZA World of Labor. [Google Scholar]
  57. Sheehan, Maura, and Thomas Garavan. 2022. High-performance work practices and labour productivity: A six-wave longitudinal study of UK manufacturing and service SMEs. The International Journal of Human Resource Management 33: 3353–86. [Google Scholar] [CrossRef]
  58. Shwartz-Ziv, Ravid, and Amitai Armon. 2022. Tabular data: Deep learning is not all you need. Information Fusion 81: 84–90. [Google Scholar] [CrossRef]
  59. Song, Laiming, Chunqiu Wang, Chuan Lu, Shuo Yang, Chaodong Tan, and Xiongying Zhang. 2023. Machine Learning Model of Oilfield Productivity Prediction and Performance Evaluation. Journal of Physics: Conference Series 2468: 012084. [Google Scholar]
  60. Sorostinean, Radu, Arpad Gellert, and Bogdan Constantin Pirvu. 2021. Assembly Assistance System with Decision Trees and Ensemble Learning. Sensors 21: 3580. [Google Scholar] [CrossRef]
  61. Sulistyaningsih, Elli. 2023. Improving Human Resources Technology Innovation as a Business Growth Driver in the Society 5.0 Era. ADI Journal on Recent Innovation 4: 149–59. [Google Scholar] [CrossRef]
  62. Chai, Tianfeng, and Roland R. Draxler. 2014. Root mean square error (RMSE) or mean absolute error (MAE). Geoscientific Model Development Discussions 7: 1525–34. [Google Scholar] [CrossRef]
  63. Wardhani, Rulyanti Susi, Kamal Kant, Anusha Sreeram, Monica Gupta, Erwandy Erwandy, and Pranjal Kumar Bora. 2022. Impact of Machine Learning on the Productivity of Employees in Workplace. Paper presented at the 2022 4th International Conference on Inventive Research in Computing Applications (ICIRCA), Coimbatore, India, September 21; New York: IEEE, pp. 930–34. [Google Scholar]
  64. Zamani Joharestani, Mehdi, Chunxiang Cao, Xiliang Ni, Barjeece Bashir, and Somayeh Talebiesfandarani. 2019. PM2.5 prediction based on random forest, XGBoost, and deep learning using multisource remote sensing data. Atmosphere 10: 373. [Google Scholar] [CrossRef]
  65. Zebua, Selamat, and Mochamad Heru Riza Chakim. 2023. Effect of Human Resources Quality, Performance Evaluation, and Incentives on Employee Productivity at Raharja High School. APTISI Transactions on Management (ATM) 7: 1–7. [Google Scholar]
Figure 1. The graphical representation of actual productivity achieved in comparison to the targeted productivity.
Figure 2. The lowest incentive received by an employee from department 5.
Figure 3. The highest incentive received by employees from department 5.
Figure 4. Correlation matrix using heat map of classifiers.
Figure 5. The architecture of LightGBM. Source: (Mandot 2017; Hui et al. 2023).
Figure 6. The procedure of gradient boosting. Source: Feng and Fu (2020).
Figure 7. Procedures for making a prediction using Random Forest regressor. Source: Graw et al. (2021).
Figure 8. AdaBoost algorithm calculation process.
Figure 9. The architecture of XG Boost. Source: Nadeem (2021).
Figure 10. Performance model comparison of six different algorithms.
Table 1. Flow chart of research methodology.
No. of Steps | Description of Research Methodology
1 | A total population of 1098 was extracted from the 4 different municipalities.
2 | Six algorithms were identified through lazy prediction and applied to the data set.
3 | The data were studied thoroughly by the authors for further pre-processing through Jupyter Notebook in Python language. In the preprocessing part, the data are organized (numbering each department, segregating the workers department-wise, handling missing values).
4 | Further correlation analysis was performed to find out the correlation between the variables and to clearly understand the data in and out.
5 | The data were split up into two parts: training and testing. Training consisted of 879, and testing contained 219.
6 | The model was trained, and correlation analysis was applied to predict the required results.
7 | Lastly, with the help of the results and the evaluation process of the model techniques like MSE and R Squared, a predictive model was developed.
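The split in step 5 can be reproduced in outline as follows. This is a sketch under assumptions: the random seed and the shuffling procedure are not reported in the paper, so the function below is hypothetical and only reproduces the reported sizes (879 training rows and 219 testing rows out of 1098).

```python
import random


def split_indices(n_rows, n_test, seed=42):
    """Shuffle row indices and hold out n_test of them for testing."""
    rng = random.Random(seed)   # the seed is an arbitrary choice
    indices = list(range(n_rows))
    rng.shuffle(indices)
    return indices[n_test:], indices[:n_test]


train_idx, test_idx = split_indices(n_rows=1098, n_test=219)
```

Holding out 219 of 1098 rows corresponds to roughly 20% of the data for testing and 80% for training.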
Table 2. Data description.
S. No. | Attribution | Description
1 | Department Number | It ranges from 1–12.
2 | Targeted productivity | Productivity targets are set by the department for each team for each quarter.
3 | SMV | Standard Minute Value is the allocated time for a task.
4 | WIP | Work in progress. Includes the number of unfinished works for each department.
5 | Overtime | Represents the amount of overtime by each team in minutes.
6 | Incentive | Represents the level of financial incentive that enables or motivates a particular course of action.
7 | Actual productivity | Percentage of actual productivity provided by workers. It varies from 0 to 1.
Table 3. The number of workers in each department of the municipality.
Department No. | Department Name | No. of Workers (Data from Each Dept)
1 | Public work department | 90
2 | Property tax department | 99
3 | Health department | 88
4 | Street light department | 94
5 | IT department | 85
6 | Sanitation department | 91
7 | Birth/Death Certificate department | 86
8 | Education department | 101
9 | Disaster management | 96
10 | Election department | 93
11 | Project Implementation Unit (PIU) CELL | 80
12 | Postal mail department | 95
Table 4. Results of identified algorithms.
S. No. | Model | Adjusted R Squared | R Squared | RMSE | Time Taken
1 | Gradient Boosting Regressor | 0.37 | 0.39 | 0.14 | 0.09
2 | LGBM Regressor | 0.33 | 0.35 | 0.15 | 0.07
3 | Hist Gradient Boosting Regressor | 0.33 | 0.35 | 0.15 | 0.48
4 | Random Forest Regressor | 0.2 | 0.22 | 0.16 | 0.31
5 | Ada-boost Algorithm | 0.17 | 0.2 | 0.16 | 0.05
6 | Xg-Boost Regressor | 0.11 | 0.14 | 0.17 | 0.1
Table 5. Performance evaluation of model.
Model | R Squared | MSE (Mean Squared Error)
XG Boost | 0.71 | 0.01
LGBM | 0.63 | 0.01
Hist Gradient Boosting Regressor | 0.54 | 0.01
Gradient Boosting Regressor | 0.25 | 0.01
Random Forest Regressor | −0.55 | 0.02
Ada Boost Regressor | −0.80 | 0.02
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Bijalwan, P.; Gupta, A.; Mendiratta, A.; Johri, A.; Asif, M. Predicting the Productivity of Municipality Workers: A Comparison of Six Machine Learning Algorithms. Economies 2024, 12, 16. https://doi.org/10.3390/economies12010016
