Building Data Literacy for Sustainable Development: A Framework for Effective Training
Abstract
1. Introduction
1.1. Motivation
1.2. Background
1.3. Research Questions and Objectives
- Understand and use Python data science libraries.
- Write Python code for statistical analysis.
- Create visualisation graphics using Python libraries.
- Do personal characteristics affect employees’ performance in data analytics training?
- Do professional characteristics affect employees’ performance in data analytics training?
- Does prior knowledge affect employees’ performance in data analytics training?
- To visualise the distributional behaviour of responses.
- To assess visual objects with respect to questions.
- To explore associations between variables.
- To test and interpret the associations.
- To highlight new directions for data visualisation.
2. Methods
2.1. Data Sources
2.2. Modelling Strategy
2.2.1. Observable and Expected Values Deviations
2.2.2. Correspondence Analysis
3. Findings and Analyses
3.1. Visualisation of Variables
3.2. Assessing Variable Independence
3.3. Correspondence Analyses
4. Discussion
4.1. The Importance of Technical Communicative Skills in the Workplace
- Conducting a needs analysis to identify employees’ specific linguistic needs and weaknesses. This could be done through initial tests during recruitment or before employees start their jobs. Organisations can also use AI-assisted training tools to determine employees’ levels and identify areas for improvement. The outcomes of such an assessment could inform those who create training content, helping them design discipline-specific programs relevant to Big Data Analytics.
- Evaluating and updating training programs regularly to keep pace with rapidly evolving technologies and requirements for Big Data Analytics.
- Recruiting expert facilitators and trainers in technical communication. The facilitators should be able to tailor their materials to the Big Data field and its associated practices.
- Designing generic and inclusive training materials to support employees. These materials should focus on the technical linguistic skills employees need to perform their jobs successfully.
- Providing employees with regular training in technical communication. This could span multiple levels, from basic to advanced, to meet the needs of diverse communities, particularly in organisations where employees have different linguistic and cultural backgrounds.
4.2. Legal and Ethical Considerations
5. Research Contribution and Implications
6. Concluding Remarks
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest











| Variable | Description and Relevance |
|---|---|
| Age | Age (in years) of respondent |
| Gender | Gender of respondent |
| Nationality | Nationality of respondent, binarized to UAE and non-UAE (Expat) |
| Experience | Working experience of respondent (in years) |
| Education | Education level attained by the respondent at the time of the course |
| Major | Main area of study |
| Income | Income level of respondent (in UAE Dirhams; 1 USD ≈ 3.67 AED) |
| Marital | Marital status of respondent |
| Family | Number of people in the family, including respondent |
| Work | Sector in which respondent works |
| Job | Type of job of respondent |
| English | Respondent’s self-assessment of English technical communication proficiency |
| Computing | Respondent’s self-assessment of computing skills |
| Analytics | Respondent’s self-assessment of data analytics skills |
| Statistics | Respondent’s work-related statistical roles and responsibilities |
| StatSoftware | Respondent’s previous training in using any statistical package |
| Python | Respondent’s previous training in using machine learning tools such as Python or R |
| Big Data | Respondent’s experience with Big Data |
| Decision | Respondent’s contribution to key decision-making at work |
| Coding | Respondent’s ability to write and/or understand computer code (programming) |
| ML | Respondent’s general understanding of Machine Learning |
| AI | Respondent’s general understanding of Artificial Intelligence |
| DL | Respondent’s general understanding of Deep Learning |
| Course Need | Respondent’s need to attend the Big Data Analytics course |
| Benefit | Respondent’s self-assessment of the benefit of attending the Big Data Analytics course |
| Score | Respondent’s performance on the final exam, scored out of 60 (categorical) |
| Results | Respondent’s performance on the final exam, scored out of 60 (numerical) |

| Education | Acceptable | Excellent | Good | Very Good | Weak | Total |
|---|---|---|---|---|---|---|
| Diploma | 0 | 0 | 0 | 0 | 1 | 1 |
| First Degree | 16 | 3 | 10 | 4 | 25 | 58 |
| High School | 1 | 0 | 2 | 0 | 4 | 7 |
| Masters | 2 | 1 | 9 | 2 | 6 | 20 |
| Total | 19 | 4 | 21 | 6 | 37 | 87 |
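
The comparison of observed and expected values named in Section 2.2.1 can be illustrated on the Education table above. The following is a minimal sketch, not the authors' code: it assumes the standard Pearson chi-square statistic, uses only the cell counts as printed, and recomputes the margins from those cells (so the grand total it works with is 86, not the printed 87). The function and variable names are hypothetical.

```python
# Cell counts copied from the Education table above (Total row/column omitted).
# Margins are recomputed from the cells; names here are illustrative only.
education = ["Diploma", "First Degree", "High School", "Masters"]
levels = ["Acceptable", "Excellent", "Good", "Very Good", "Weak"]
observed = [
    [0, 0, 0, 0, 1],
    [16, 3, 10, 4, 25],
    [1, 0, 2, 0, 4],
    [2, 1, 9, 2, 6],
]


def chi_square_independence(table):
    """Return (statistic, degrees of freedom, expected counts) for a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count under independence: row total * column total / grand total.
    expected = [[rt * ct / n for ct in col_totals] for rt in row_totals]
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof, expected


statistic, dof, expected = chi_square_independence(observed)
# Count cells whose expected value falls below the common rule-of-thumb of 5.
small_cells = sum(e < 5 for row in expected for e in row)

print(f"chi-square = {statistic:.3f} on {dof} degrees of freedom")
print(f"{small_cells} of {len(education) * len(levels)} expected counts are below 5")
```

With a table this sparse, most expected counts fall below 5, so the chi-square approximation should be read with caution; merging sparse categories or using an exact or simulation-based test are common alternatives.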

| Profession | Value |
|---|---|
| Secretary | −0.899 |
| Teacher | −0.462 |
| Services | −0.437 |
| Admin | 0.100 |
| Researcher | 0.208 |
| Accountant | 0.338 |
| Programmer | 0.630 |
| Engineer | 1.083 |

| Level | Value |
|---|---|
| weak | −0.419 |
| acceptable | −0.146 |
| very good | 0.431 |
| good | 0.542 |
| excellent | 1.083 |

| Profession | Value |
|---|---|
| Secretary | −1.562 |
| Services | −0.807 |
| Teacher | −0.319 |
| Programmer | 0.070 |
| Researcher | 0.079 |
| Accountant | 0.373 |
| Admin | 0.505 |
| Engineer | 0.629 |

| Level | Value |
|---|---|
| acceptable | −0.705 |
| weak | −0.026 |
| very good | 0.073 |
| good | 0.221 |
| excellent | 0.884 |

| Score | Value |
|---|---|
| 48/60 | −0.467 |
| 24/60 | −0.421 |
| 60/60 | −0.189 |
| 36/60 | −0.079 |
| 54/60 | −0.062 |
| 30/60 | −0.014 |
| 18/60 | 0.313 |
| 42/60 | 1.391 |

| Profession | Value |
|---|---|
| Services | −0.493 |
| Programmer | −0.418 |
| Accountant | −0.319 |
| Teacher | −0.220 |
| Admin | 0.261 |
| Engineer | 0.591 |
| Researcher | 0.719 |
| Secretary | 2.844 |

| Score | Value |
|---|---|
| 48/60 | −0.505 |
| 30/60 | −0.459 |
| 36/60 | −0.409 |
| 60/60 | 0.052 |
| 54/60 | 0.101 |
| 24/60 | 0.551 |
| 18/60 | 0.699 |
| 42/60 | 0.711 |

| Level | Value |
|---|---|
| weak | −0.775 |
| acceptable | −0.369 |
| good | 0.071 |
| very good | 0.355 |
| excellent | 0.986 |
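
Category scores like those listed above are the typical output of correspondence analysis (Section 2.2.2). The counts behind the profession and score tables are not reproduced in this section, so the sketch below uses the Education table instead, purely as an illustration. It assumes the textbook formulation (standardized residuals, leading singular vectors, principal coordinates) and finds the first dimension by power iteration in plain Python; it is not the authors' implementation, and all names are hypothetical. The sign of a dimension is arbitrary, so coordinates may come out mirrored.

```python
import math

# Education x level counts from the contingency table (margins recomputed from cells).
rows = ["Diploma", "First Degree", "High School", "Masters"]
cols = ["Acceptable", "Excellent", "Good", "Very Good", "Weak"]
counts = [
    [0, 0, 0, 0, 1],
    [16, 3, 10, 4, 25],
    [1, 0, 2, 0, 4],
    [2, 1, 9, 2, 6],
]

n = sum(map(sum, counts))
row_mass = [sum(row) / n for row in counts]
col_mass = [sum(col) / n for col in zip(*counts)]

# Standardized residuals: (p_ij - r_i * c_j) / sqrt(r_i * c_j).
S = [
    [
        (counts[i][j] / n - row_mass[i] * col_mass[j])
        / math.sqrt(row_mass[i] * col_mass[j])
        for j in range(len(cols))
    ]
    for i in range(len(rows))
]


def normalise(vec):
    """Scale a vector to unit length and also return its original length."""
    length = math.sqrt(sum(x * x for x in vec))
    return [x / length for x in vec], length


def times_S(v):
    """Compute S v (a vector over rows)."""
    return [sum(S[i][j] * v[j] for j in range(len(cols))) for i in range(len(rows))]


def times_S_transposed(u):
    """Compute S^T u (a vector over columns)."""
    return [sum(S[i][j] * u[i] for i in range(len(rows))) for j in range(len(cols))]


# Power iteration on S^T S converges to the leading right singular vector.
v = normalise([float(j + 1) for j in range(len(cols))])[0]
for step in range(500):
    v = normalise(times_S_transposed(times_S(v)))[0]

u, sigma = normalise(times_S(v))  # leading left singular vector and singular value

# Principal coordinates on the first dimension.
row_coords = [sigma * u[i] / math.sqrt(row_mass[i]) for i in range(len(rows))]
col_coords = [sigma * v[j] / math.sqrt(col_mass[j]) for j in range(len(cols))]

inertia_1 = sigma ** 2
total_inertia = sum(x * x for row in S for x in row)  # equals chi-square / n

for name, coord in sorted(zip(rows, row_coords), key=lambda pair: pair[1]):
    print(f"{name:<13}{coord: .3f}")
print(f"dimension 1 explains {100 * inertia_1 / total_inertia:.1f}% of total inertia")
```

Sorting categories by their coordinate, as the tables above do, makes it easy to see which row and column categories sit at the same end of a dimension and are therefore associated.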
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Said, R.A.T.; Mwitondi, K.S.; Benseddik, L.; Chemlali, L. Building Data Literacy for Sustainable Development: A Framework for Effective Training. Data 2025, 10, 188. https://doi.org/10.3390/data10110188

