Curriculum, Pedagogy, and Teaching/Learning Strategies in Data Science Education
Abstract
1. Introduction
2. Related Work
3. Methodology
3.1. Keywords
3.2. Inclusion and Exclusion Criteria
- Inclusion criteria: document type (conference paper, article, review, or book chapter), language (English or Spanish), and publication period (2005–2024).
- Exclusion criteria: documents with undefined authors (the name of the author is not specified).
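The criteria above can be read as a simple record filter. The following is a minimal sketch under stated assumptions: the `Record` layout, its field names, and the sample records are hypothetical and are not taken from the study's dataset or tooling.

```python
from dataclasses import dataclass

# Values taken from the inclusion criteria listed above.
ALLOWED_TYPES = {"conference paper", "article", "review", "book chapter"}
ALLOWED_LANGUAGES = {"english", "spanish"}
YEAR_RANGE = range(2005, 2025)  # 2005 to 2024 inclusive


@dataclass
class Record:
    # Hypothetical record layout; field names are illustrative only.
    title: str
    authors: list[str]
    doc_type: str
    language: str
    year: int


def is_eligible(rec: Record) -> bool:
    """True if the record meets every inclusion criterion and has a named author."""
    has_named_author = any(a.strip() for a in rec.authors)
    return (
        rec.doc_type.lower() in ALLOWED_TYPES
        and rec.language.lower() in ALLOWED_LANGUAGES
        and rec.year in YEAR_RANGE
        and has_named_author
    )


# Made-up sample records used only to exercise the filter.
records = [
    Record("A", ["Doe, J."], "Article", "English", 2019),
    Record("B", [], "Review", "English", 2020),                 # no author: excluded
    Record("C", ["Roe, R."], "Editorial", "English", 2021),     # type not included
    Record("D", ["Poe, E."], "Book chapter", "Spanish", 2004),  # outside time frame
]
eligible = [r.title for r in records if is_eligible(r)]
print(eligible)
```

Keeping the criteria as named constants makes it easy to see which rule removed a record when the filter is rerun with a different time frame or language set.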
3.3. Search
3.4. Screening
3.5. Analysis
- RQ1: What is the evolution of research in data science education in terms of annual scientific growth, the countries and affiliations that contribute the most, and the most relevant publication sources?
- RQ2: What are the academic contributions to the field of data science education from the perspective of authors in terms of the number of publications, keywords, and citations per year?
- RQ3: What are the trend topics in the data science education field based on content analysis?
- RQ4: What are the recent advances and future challenges in pedagogy, curriculum, and teaching/learning strategies in the data science education field?
- RQ5: What are the core courses, skills, and concepts in the data science education field?
- Metadata analysis: a general analysis of the results obtained, considering annual scientific growth (identification of the number of documents published per year), scientific production per country (identification of contributions made by the top 20 countries), production per affiliation (identification of contributions made by the top 20 affiliations), and most relevant sources (identification of contributions made by the top 20 sources of publication).
- Academic contributions analysis: this analysis allows us to identify the scientific outputs of different authors, where they come from, what their contributions are, and which other authors they cited. This analysis is based on the contributions of authors over time (identification of contributions made by the top 20 relevant authors in each year), relevant keywords (most common words used in the studies), and relations among authors cited in such contributions.
- Content and document analysis: identification of the most common topics and their thematic evolution based on the authors’ keywords. For this category, we used the results of the scoping review on the specific data science courses, topics, and skills reported in the studies.
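The counting behind the metadata analysis (documents per year and a top-N ranking) can be sketched as follows. This is an illustrative example only: the `documents` list is invented, the function names are hypothetical, and the study's own tooling is not reproduced here.

```python
from collections import Counter

# Hypothetical records: (publication year, country). Not data from the study.
documents = [
    (2019, "USA"), (2020, "USA"), (2020, "China"),
    (2021, "Germany"), (2021, "USA"), (2021, "China"),
]


def annual_production(docs):
    """Number of documents per year, in chronological order."""
    counts = Counter(year for year, _ in docs)
    return dict(sorted(counts.items()))


def top_n(docs, n=20):
    """The n most productive countries with their document counts."""
    return Counter(country for _, country in docs).most_common(n)


per_year = annual_production(documents)
top_countries = top_n(documents, n=2)
print(per_year)
print(top_countries)
```

The same `top_n` pattern applies to affiliations and publication sources by swapping the field that is counted.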
4. Results
4.1. Metadata Analysis
4.1.1. Dataset Description
4.1.2. Research Areas
4.1.3. Annual Scientific Growth
4.1.4. Scientific Production and Country Collaboration
4.1.5. Most Relevant Affiliations
4.1.6. Most Relevant Sources
4.2. Analysis of Authors
4.2.1. Most Contributing Authors
4.2.2. Analysis of Authors’ Keywords
- Cluster 1: the focus in this cluster is educational technologies and analysis, which connects 17 topics (bibliometrics, blended learning, curriculum design, data sciences, design thinking, e-learning, educational data science, educational innovation, engineering education, higher education, interdisciplinarity, learning analytics, natural language processing, skills, teaching and learning, text mining, and virtual reality). According to these topics, teaching and learning strategies in educational data science involve the development of skills for conducting analysis by integrating approaches such as learning analytics, natural language processing, and text mining, among others. Moreover, there are other technologies that integrate data science elements (e.g., virtual reality allows developers to capture data from users’ interactions, and these data can be analyzed for multiple purposes).
- Cluster 2: in this case, the focus is also on the educational setting, but it is more closely related to computing and programming education and to tools or techniques for data analysis. This cluster connects 17 topics (academic performance, ChatGPT, computing, computing education, course design, curricula, data mining, data science, deep learning, ethics, experimental learning, gamification, programming, programming education, Python, R, visualization). Some topics concern curriculum design in computing education, such as course design and curricula, and others concern teaching and learning strategies (i.e., gamification and experimental learning). Moreover, the skills, tools, and techniques for data analysis mentioned in this cluster are programming, data mining, deep learning, Python, R, and visualization.
- Cluster 3: this cluster addresses the topic of educational data science and its application in different settings. It includes 16 topics (AI, artificial intelligence, assessment, continuing education, curriculum, cybersecurity, data analysis, educational data mining, health informatics, IoT, medical education, pedagogy, professional development, research, STEM, training). Some of the application settings are health, medicine, and cybersecurity.
- Cluster 4: the main topic in this cluster is the skills and abilities for data science. It involves 15 topics (analytics, collaboration, computational thinking, computer science education, data ethics, data literacy, data management, data science education, digital transformation, information visualization, open data, statistical literacy, statistics education, statistics education research, teaching statistics). These topics reflect the learning path of someone who wants to dive deep into the data science world.
- Cluster 5: this cluster focuses on teaching strategies and pedagogical issues. This cluster has 12 topics (data science applications in education, distance education and online learning, evaluation methodologies, improving classroom teaching, learning communities, lifelong learning, pedagogical issues, postsecondary education, secondary education, statistical computing, teacher professional development, teaching/learning strategies). All these topics refer to the work of data science teachers.
- Cluster 6: the main topic in this cluster is online technologies and platforms for data science education. It involves 10 topics (big data technology, data analytics, data visualization, Jupyter Notebook, MOOCs, online education, online learning, project-based learning, simulation, undergraduate education). Online education and online learning make it possible to pursue data science programs at different levels, including undergraduate education, and massive open online courses (MOOCs) offer the opportunity to follow educational programs in an online setting.
- Cluster 7: this cluster deals with interdisciplinary approaches that contribute to the data science education field, and, particularly, with curriculum design. This cluster is composed of nine topics (big data, cloud computing, computer science, curriculum development, the EDISON Data Science Framework, education, informatics, interdisciplinarity, statistics). Here, interdisciplinarity plays an important role because the data science education field does not function as a singular field; it depends on other fields that complement the educational path for the data scientist.
- Cluster 8: this cluster addresses technology in education in the new era and how it has been influenced by health and societal issues such as COVID-19. Eight topics are included in this cluster (COVID-19, educational technology, ICT, learning, machine learning, MOOCs, teaching, technology). The MOOC term appears again in this cluster, as an indicator of platforms that serve online education.
- Cluster 9: this cluster has an analytical focus. It includes six topics (active learning, big data analytics, business intelligence, content analysis, information sciences, information technology). These topics are related to techniques for data analysis that are also part of the knowledge that a data scientist should acquire in the learning process.
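Keyword clusters like those above are typically built on how often keywords appear together in the same document. The sketch below shows only that co-occurrence counting step. It is an assumption-laden illustration, not the clustering procedure used in the study: the keyword lists are invented and the function name is hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical author-keyword lists, one per document. Not data from the study.
keyword_lists = [
    ["data science", "Python", "visualization"],
    ["data science", "python", "programming"],
    ["learning analytics", "text mining"],
]


def cooccurrence(lists):
    """Count how often each unordered keyword pair appears in the same document."""
    pairs = Counter()
    for keywords in lists:
        # Normalize case and whitespace so "Python" and "python" count as one keyword.
        unique = sorted({k.strip().lower() for k in keywords})
        pairs.update(combinations(unique, 2))
    return pairs


pairs = cooccurrence(keyword_lists)
print(pairs.most_common(3))
```

A clustering tool would then group keywords whose pair counts are high relative to their individual frequencies; that step is omitted here.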
4.2.3. Pedagogical Aspects and Teaching/Learning Strategies
4.3. Content and Document Analysis
4.3.1. Thematic Evolution
4.3.2. Courses, Topics, and Skills in Data Science Education
5. Discussion
5.1. RQ1: What Is the Evolution of Research in Data Science Education in Terms of Annual Scientific Growth, the Countries and Affiliations That Contribute the Most, and the Most Relevant Publication Sources?
5.2. RQ2: What Are the Academic Contributions to the Field of Data Science Education from the Perspective of Authors in Terms of the Number of Publications, Keywords, and Citations per Year?
5.3. RQ3: What Are the Trend Topics in the Data Science Education Field Based on Content Analysis?
5.4. RQ4: What Are the Recent Advances and Future Challenges in Pedagogy, Curriculum, and Teaching/Learning Strategies in the Data Science Education Field?
5.5. RQ5: What Are the Core Courses, Skills, and Concepts in the Data Science Education Field?
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Aria, M., & Cuccurullo, C. (2017). bibliometrix: An R-tool for comprehensive science mapping analysis. Journal of Informetrics, 11(4), 959–975. [Google Scholar] [CrossRef]
- Asamoah, D. A., Doran, D., & Schiller, S. (2020). Interdisciplinarity in data science pedagogy: A foundational design. Journal of Computer Information Systems, 60(4). Available online: https://recursosvirtuales.konradlorenz.edu.co:2418/doi/abs/10.1080/08874417.2018.1496803 (accessed on 5 November 2024).
- Azevedo, A., & Azevedo, J. M. (2021). Learning analytics: A bibliometric analysis of the literature over the last decade. International Journal of Educational Research Open, 2, 100084. [Google Scholar] [CrossRef]
- Baako, T.-M. D., Kulkarni, S. K., McClendon, J. L., Harcum, S. W., & Gilmore, J. (2024). Machine learning and deep learning strategies for Chinese hamster ovary cell bioprocess optimization. Fermentation, 10, 234. [Google Scholar] [CrossRef]
- Bai, L., & Hu, Y. (2018, May 19–20). Problem-driven teaching activities for the capstone project course of data science. ACM Turing Celebration Conference—China (pp. 130–131), Shanghai, China. [Google Scholar] [CrossRef]
- Berikan, B., & Özdemir, S. (2020). Investigating “problem-solving with datasets” as an implementation of computational thinking: A literature review. Journal of Educational Computing Research, 58(2), 502–534. [Google Scholar] [CrossRef]
- Bile Hassan, I., Ghanem, T., Jacobson, D., Jin, S., Johnson, K., Sulieman, D., & Wei, W. (2021). Data science curriculum design: A case study. In SIGCSE 2021—Proceedings of the 52nd ACM technical symposium on computer science education (pp. 529–534). Association for Computing Machinery, Inc. [Google Scholar] [CrossRef]
- Boaler, J., Conte, K., Cor, K., Dieckmann, J. A., LaMar, T., Ramirez, J., & Selbach-Allen, M. (2024). Studying the opportunities provided by an applied high school mathematics course: Explorations in data science. Journal of Statistics and Data Science Education, 33, 26–45. [Google Scholar] [CrossRef]
- Bonnell, J., Ogihara, M., & Yesha, Y. (2022). Challenges and issues in data science education. Computer, 55(2), 63–66. [Google Scholar] [CrossRef]
- Boztaş, G. D., Berigel, M., & Altınay, F. (2024). A bibliometric analysis of educational data mining studies in global perspective. Education and Information Technologies, 29(7), 8961–8985. [Google Scholar] [CrossRef]
- Burr, W., Chevalier, F., Collins, C., Gibbs, A. L., Ng, R., & Wild, C. J. (2021). Computational skills by stealth in introductory data science teaching. Teaching Statistics, 43(S1), S34–S51. [Google Scholar] [CrossRef]
- Chao, L., Xing, C., Zhang, Y., & Zhang, C. (2020). Data science: State of the art and trends. Data Science and Informetrics, 1(1), 22–49. [Google Scholar]
- Christozov, D. G., Rasheva-Yordanova, K., & Toleva-Stoimenova, S. (2019). Challenges in designing curriculum for trans-disciplinary education: On cases of designing concentration on informing science and master program on data science. Informing Science: The International Journal of an Emerging Transdiscipline, 22, 19–30. [Google Scholar] [CrossRef]
- Çetinkaya-Rundel, M., & Rundel, C. (2018). Infrastructure and tools for teaching computing throughout the statistical curriculum. The American Statistician. Available online: https://recursosvirtuales.konradlorenz.edu.co:2418/doi/abs/10.1080/00031305.2017.1397549 (accessed on 10 November 2024).
- Demchenko, Y., Belloum, A., de Laat, C., Loomis, C., Wiktorski, T., & Spekschoor, E. (2017, December 11–14). Customisable data science educational environment: From competences management and curriculum design to virtual labs on-demand. 2017 IEEE International Conference on Cloud Computing Technology and Science (CloudCom) (pp. 363–368), Hong Kong. [Google Scholar] [CrossRef]
- Demchenko, Y., Comminiello, L., & Reali, G. (2019, March 27–29). Designing customisable data science curriculum using ontology for data science competences and body of knowledge. 2019 International Conference on Big Data and Education (pp. 124–128), London, UK. [Google Scholar] [CrossRef]
- Demchenko, Y., Gruengard, E., & Klous, S. (2014, December 15–18). Instructional model for building effective big data curricula for online and campus education. 2014 IEEE 6th International Conference on Cloud Computing Technology and Science (pp. 935–941), Singapore. [Google Scholar] [CrossRef]
- Demchenko, Y., José, C. G. J., Brewer, S., & Wiktorski, T. (2021a, April 21–23). EDISON data science framework (EDSF): Addressing demand for data science and analytics competences for the data driven digital economy. 2021 IEEE Global Engineering Education Conference (EDUCON) (pp. 1682–1687), Vienna, Austria. [Google Scholar] [CrossRef]
- Demchenko, Y., Maijer, M., & Comminiello, L. (2021b, February 3–5). Data scientist professional revisited: Competences definition and assessment, curriculum and education path design. 2021 4th International Conference on Big Data and Education (pp. 52–62), London, UK. [Google Scholar] [CrossRef]
- De Veaux, R., Agarwal, M., Averett, M., Baumer, B., Bray, A., Bressoud, T., Bryant, L., Cheng, L., Francis, A., Gould, R., Kim, A., Kretchmar, R., Lu, Q., Moskol, A., Nolan, D., Pelayo, R., Raleigh, S., Sethi, R., Sondjaja, M., . . . Ye, P. (2017). Curriculum guidelines for undergraduate programs in data science. Annual Review of Statistics and Its Application, 4. [Google Scholar] [CrossRef]
- Dogucu, M., Demirci, S., Bendekgey, H., Ricci, F. Z., & Medina, C. M. (2024). A systematic literature review of undergraduate data science education research. Available online: https://www.semanticscholar.org/paper/A-Systematic-Literature-Review-of-Undergraduate-Dogucu-Demirci/3f9629641456ccfc66b0afb58206c0e2b692609b (accessed on 2 December 2024).
- Dogucu, M., Johnson, A. A., & Ott, M. (2023). Framework for accessible and inclusive teaching materials for statistics and data science courses. Journal of Statistics and Data Science Education, 31, 144–150. [Google Scholar] [CrossRef]
- Donoghue, T., Voytek, B., & Ellis, S. E. (2021). Teaching creative and practical data science at scale. Journal of Statistics and Data Science Education. Available online: https://recursosvirtuales.konradlorenz.edu.co:2418/doi/abs/10.1080/10691898.2020.1860725 (accessed on 5 December 2024). [CrossRef]
- Drummond, D. E. (2016, October 12–15). Open sourcing education for data engineering and data science. 2016 IEEE Frontiers in Education Conference (FIE) (p. 1), Erie, PA, USA. [Google Scholar] [CrossRef]
- Echeverria, F., Kao, Y., & Hubbard Cheuoua, A. (2023, March 15–18). Using student and teacher feedback to modify CS curriculum. SIGCSE 2023—Proceedings of the 54th ACM Technical Symposium on Computer Science Education (Vol. 2, p. 1420), Toronto, ON, Canada. [Google Scholar] [CrossRef]
- Echeverria, V., Martinez-Maldonado, R., & Buckingham Shum, S. (2017, November 28–December 1). Towards data storytelling to support teaching and learning. 29th Australian Conference on Computer-Human Interaction (pp. 347–351), Brisbane, QLD, Australia. [Google Scholar] [CrossRef]
- Ellegaard, O., & Wallin, J. A. (2015). The bibliometric analysis of scholarly production: How great is the impact? Scientometrics, 105(3), 1809–1831. [Google Scholar] [CrossRef]
- Fernandez, C., Freitas, J., Blikstein, P., & de Deus Lopes, R. (2024). The design space of visualization tools for data science education: Literature review and framework for future designs. International Journal of Child-Computer Interaction, 100698. [Google Scholar] [CrossRef]
- Fitzgerald, B. K., Barkanic, S., Cárdenas-Navia, I., Chen, J., Elzey, K., Hughes, D., & Troyan, D. (2016). The BHEF national higher education and workforce initiative: A model for pathways to baccalaureate attainment and high-skill careers in emerging fields, Part 3. Industry and Higher Education, 30(6), 433–439. [Google Scholar] [CrossRef]
- Friedman, A. (2019). Data science syllabi measuring its content. Education and Information Technologies, 24(6), 3467–3481. [Google Scholar] [CrossRef]
- Frischemeier, D., Biehler, R., Podworny, S., & Budde, L. (2021). A first introduction to data science education in secondary schools: Teaching and learning about data exploration with CODAP using survey data. Teaching Statistics, 43(S1), S182–S189. [Google Scholar] [CrossRef]
- Golubski, C. (2016, October 12–15). Using inquiry-based learning in engineering statistics courses. 2016 IEEE Frontiers in Education Conference (FIE) (pp. 1–3), Erie, PA, USA. [Google Scholar] [CrossRef]
- Guzman, L. M., Pennell, M. W., Nikelski, E., & Srivastava, D. S. (2019). Successful integration of data science in undergraduate biostatistics courses using cognitive load theory. CBE—Life Sciences Education, 18(4), ar49. [Google Scholar] [CrossRef] [PubMed]
- Gymrek, M., & Farjoun, Y. (2016). Recommendations for open data science. GigaScience, 5(1), s13742-016-0127-4. [Google Scholar] [CrossRef]
- Hagen, L. (2020). Teaching undergraduate data science for information schools. Education for Information, 36(2), 109–117. [Google Scholar] [CrossRef]
- Hardin, J., Hoerl, R., Horton, N. J., Nolan, D., Baumer, B., Hall-Holt, O., Murrell, P., Peng, R., Roback, P., Temple Lang, D., & Ward, M. D. (2015). Data science in statistics curricula: Preparing students to “think with data”. The American Statistician, 69(4), 343–353. [Google Scholar] [CrossRef]
- Hazzan, O., & Mike, K. (2020). Ten challenges of data science education. Communications of the ACM. Available online: https://cacm.acm.org/blogcacm/ten-challenges-of-data-science-education/ (accessed on 3 October 2024).
- Hazzan, O., & Mike, K. (2021). A journal for interdisciplinary data science education. Communications of the ACM, 64(8), 10–11. [Google Scholar] [CrossRef]
- Heinemann, B., Opel, S., Budde, L., Schulte, C., Frischemeier, D., Biehler, R., Podworny, S., & Wassong, T. (2018, November 22–25). Drafting a data science curriculum for secondary schools. 18th Koli Calling International Conference on Computing Education Research (pp. 1–5), Koli, Finland. [Google Scholar] [CrossRef]
- Hicks, S. C., & Irizarry, R. A. (2018). A guide to teaching data science. The American Statistician, 72(4), 382–391. [Google Scholar] [CrossRef] [PubMed]
- Horton, N. J. (2022). 30 years of the Journal of Statistics and Data Science Education. Journal of Statistics and Data Science Education, 30(1), 1–2. [Google Scholar] [CrossRef]
- Horton, N. J., & Hardin, J. S. (2021). Integrating computing in the statistics and data science curriculum: Creative structures, novel skills and habits, and ways to teach computational thinking. Journal of Statistics and Data Science Education, 29, S1–S3. [Google Scholar] [CrossRef]
- Hoyt, R., & Wangia-Anderson, V. (2018). An overview of two open interactive computing environments useful for data science education. JAMIA Open, 1(2), 159–165. [Google Scholar] [CrossRef]
- Hsu, Y.-C. (2024). Mapping the landscape of data science education in higher general education in Taiwan: A comprehensive syllabi analysis. Education Sciences, 14, 763. [Google Scholar] [CrossRef]
- Huppenkothen, D., Arendt, A., Hogg, D. W., Ram, K., VanderPlas, J. T., & Rokem, A. (2018). Hack weeks as a model for data science education and collaboration. Proceedings of the National Academy of Sciences USA, 115(36), 8872–8877. [Google Scholar] [CrossRef] [PubMed]
- Kaplan, D. (2018). Teaching stats for data science. The American Statistician, 72(1), 89–96. [Google Scholar] [CrossRef]
- Kauermann, G., & Seidl, T. (2018). Data science: A proposal for a curriculum. International Journal of Data Science and Analytics, 6(3), 195–199. [Google Scholar] [CrossRef]
- Kenett, R. S., & Shmueli, G. (2016). Integrating InfoQ into data science analytics programs, research methods courses, and more. Wiley Data and Cybersecurity. [Google Scholar] [CrossRef]
- Ki Kim, S., Kim, T., & Kim, K. (2023, February 20–23). Analysis of teaching and learning environment for data science and AI education (focused on 2022 revised curriculum). 5th International Conference on Artificial Intelligence in Information and Communication, ICAIIC 2023 (pp. 788–790), Bali, Indonesia. [Google Scholar] [CrossRef]
- Klašnja-Milićević, A., Ivanović, M., & Budimac, Z. (2017). Data science in education: Big data and learning analytics. Computer Applications in Engineering Education, 25(6), 1066–1078. [Google Scholar] [CrossRef]
- Kloefkorn, T., Boardman, M., Horton, N. J., & Marshall, B. (2020). National academies’ roundtable on data science postsecondary education. In Proceedings of the 51st ACM technical symposium on computer science education (pp. 956–957). Association for Computing Machinery. [Google Scholar] [CrossRef]
- Kruskal, J. B., Berkowitz, S., Geis, J. R., Kim, W., Nagy, P., & Dreyer, K. (2017). Big data and machine learning—Strategies for driving this bus: A summary of the 2016 intersociety summer conference. Journal of the American College of Radiology, 14, 811–817. [Google Scholar] [CrossRef]
- Labou, S., Yoo, H. J., Minor, D., & Altintas, I. (2019, September 24–27). Sharing and archiving data science course projects to support pedagogy for future cohorts. 2019 15th International Conference on eScience (eScience) (pp. 644–645), San Diego, CA, USA. [Google Scholar] [CrossRef]
- Lee, V. R., & Delaney, V. (2022). Identifying the content, lesson structure, and data use within pre-collegiate data science curricula. Journal of Science Education and Technology, 31(1), 81–98. [Google Scholar] [CrossRef]
- Lewis, A., & Stoyanovich, J. (2021). Teaching responsible data science: Charting new pedagogical territory. International Journal of Artificial Intelligence in Education, 32, 783–807. [Google Scholar] [CrossRef] [PubMed]
- Li, X., Fan, X., Qu, X., Sun, G., Yang, C., Zuo, B., & Liao, Z. (2019). Curriculum reform in big data education at applied technical colleges and universities in China. IEEE Access, 7, 125511–125521. [Google Scholar] [CrossRef]
- Lilan, C., & Zhong, J. (2024). Intelligent recommendation system for College English courses based on graph convolutional networks. Heliyon, 10, e29052. [Google Scholar] [CrossRef] [PubMed]
- Lin, L., Zhou, D., Wang, J., & Wang, Y. (2024). A systematic review of big data driven education evaluation. Sage Open, 14(2), 21582440241242180. [Google Scholar] [CrossRef]
- Luo, Y., Han, X., & Zhang, C. (2024). Prediction of learning outcomes with a machine learning algorithm based on online learning behavior data in blended courses. Asia Pacific Education Review, 25, 267–285. [Google Scholar] [CrossRef]
- Marín-Marín, J.-A., López-Belmonte, J., Fernández-Campoy, J.-M., & Romero-Rodríguez, J.-M. (2019). Big data in education. A bibliometric review. Social Sciences, 8(8), 223. [Google Scholar] [CrossRef]
- Memarian, B., & Doleck, T. (2024). Data science pedagogical tools and practices: A systematic literature review. Education and Information Technologies, 29(7), 8179–8201. [Google Scholar] [CrossRef]
- Merryman, L., & Lu, S. (2021). Are fashion majors ready for the era of data science? A study on the fashion undergraduate curriculums in U.S. institutions. International Journal of Fashion Design, Technology and Education. Available online: https://recursosvirtuales.konradlorenz.edu.co:2418/doi/abs/10.1080/17543266.2021.1884752 (accessed on 5 December 2024). [CrossRef]
- Mike, K. (2020). Data science education: Curriculum and pedagogy. In Proceedings of the 2020 ACM conference on international computing education research (pp. 324–325). Association for Computing Machinery. [Google Scholar] [CrossRef]
- Mike, K., Hazan, T., & Hazzan, O. (2020, November 19–22). Equalizing data science curriculum for computer science pupils. Koli Calling ’20: Proceedings of the 20th Koli Calling International Conference on Computing Education Research (pp. 1–5), Koli, Finland. [Google Scholar] [CrossRef]
- Mike, K., Kimelfeld, B., & Hazzan, O. (2023). The birth of a new discipline: Data science education. Harvard Data Science Review, 5(4). [Google Scholar] [CrossRef]
- Mikroyannidis, A., Domingue, J., Phethean, C., Beeston, G., & Simperl, E. (2018). Designing and delivering a curriculum for data science education across Europe. In Teaching and learning in a digital world (pp. 540–550). Springer. [Google Scholar] [CrossRef]
- Msweli, N. T., Mawela, T., & Twinomurinzi, H. (2023). Data science education—A scoping review. Journal of Information Technology Education: Research, 22, 263–294. [Google Scholar] [CrossRef]
- Oliveira, O. J., de Silva, F. F., da Juliani, F., Barbosa, L. C. F. M., & Nunhes, T. V. (2019). Bibliometric method for mapping the state-of-the-art and identifying research gaps and trends in literature: An essential instrument to support the development of scientific projects. In Scientometrics recent advances. IntechOpen. [Google Scholar] [CrossRef]
- Perron, B. E., Victor, B. G., Hiltz, B. S., & Ryan, J. (2020). Teaching note—Data science in the MSW curriculum: Innovating training in statistics and research methods. Journal of Social Work Education, 58(1), 193–198. [Google Scholar] [CrossRef]
- Raban, D., & Gordon, A. (2020). The evolution of data science and big data research: A bibliometric analysis. Scientometrics, 122(3), 1563–1581. [Google Scholar] [CrossRef]
- Raman, A., Thannimalai, R., Don, Y., & Rathakrishnan, M. (2021). A bibliometric analysis of blended learning in higher education: Perception, achievement and engagement. International Journal of Learning, Teaching and Educational Research, 20(6), 126–151. [Google Scholar] [CrossRef]
- Rampure, S., Shen, A., & Hug, J. (2021, March 17–21). Experiences teaching a large upper-division data science course remotely. 52nd ACM Technical Symposium on Computer Science Education (pp. 523–528), Toronto, ON, Canada. [Google Scholar] [CrossRef]
- Rao, Y. S. N., & Chen, C. J. (2024). Bibliometric insights into data mining in education research: A decade in review. Contemporary Educational Technology, 16(2), ep502. [Google Scholar] [CrossRef]
- Sakamaki, K., Taguri, M., Nishiuchi, H., Akimoto, Y., & Koizumi, K. (2022). Experience of distance education for project-based learning in data science. Japanese Journal of Statistics and Data Science, 5, 757–767. [Google Scholar] [CrossRef]
- Salas-Rueda, R.-A. (2021). Analysis of Facebook in the teaching-learning process about mathematics through data science. Canadian Journal of Learning and Technology, 47(2). [Google Scholar] [CrossRef]
- Salas-Rueda, R.-A., Eslava-Cervantes, A.-L., & Prieto-Larios, E. (2020). Teachers’ perceptions about the impact of Moodle in the educational field considering data science. Online Journal of Communication and Media Technologies, 10(4), e202023. [Google Scholar] [CrossRef]
- Saltz, J. S., Dewar, N. I., & Heckman, R. (2018, March 8–11). Key concepts for a data science ethics curriculum. 49th ACM Technical Symposium on Computer Science Education (pp. 952–957), Seattle, WA, USA. [Google Scholar] [CrossRef]
- Samsul, S. A., Yahaya, N., & Abuhassna, H. (2023). Education big data and learning analytics: A bibliometric analysis. Humanities and Social Sciences Communications, 10(1), 1–11. [Google Scholar] [CrossRef]
- Schwab-McCoy, A., Baker, C. M., & Gasper, R. E. (2021). Data science in 2020: Computing, curricula, and challenges for the next 10 years. Journal of Statistics and Data Science Education, 29, S40–S50. [Google Scholar] [CrossRef]
- Scopus. (n.d.a). Elsevier scopus blog. Available online: https://blog.scopus.com/about (accessed on 9 January 2025).
- Scopus. (n.d.b). Scopus|abstract and citation database|Elsevier. Available online: https://www.elsevier.com/products/scopus (accessed on 9 January 2025).
- Scopus. (2024). Scopus content|Elsevier. Www.Elsevier.Com. Available online: https://www.elsevier.com/products/scopus/content (accessed on 9 January 2025).
- Shao, G., Quintana, J. P., Zakharov, W., Purzer, S., & Kim, E. (2021). Exploring potential roles of academic libraries in undergraduate data science education curriculum development. The Journal of Academic Librarianship, 47(2), 102320. [Google Scholar] [CrossRef]
- Shao, Z., Yuan, S., Jin, Y., & Wang, Y. (2024). Scholar’s career switch from academia to industry: Mining and analysis from AMiner. Big Data Research, 36, 100441. [Google Scholar] [CrossRef]
- Shea, K. D., Brewer, B. B., Carrington, J. M., Davis, M., Gephart, S., & Rosenfeld, A. (2019). A model to evaluate data science in nursing doctoral curricula. Nursing Outlook, 67(1), 39–48. [Google Scholar] [CrossRef]
- Sun, W., Ding, Y., Wang, R., Liu, Y., Wang, Y., Zhu, B., & Liu, Q. (2024). Bibliometric analysis of assessment and evaluation in higher education: 2012–2023. Assessment & Evaluation in Higher Education, 49(8), 1121–1135. [Google Scholar] [CrossRef]
- Tamai, T., Okamoto, K., Iuchi, K., & Kawada, K. (2021). Development of teaching material to design a vehicle on data science in junior high school technology education. IEEJ Transactions on Electrical and Electronic Engineering, 16(10), 1407–1413. [Google Scholar] [CrossRef]
- Tobar, F., Bravo-Marquez, F., Dunstan, J., Fontbona, J., Maass, A., Remenik, D., & Silva, J. F. (2021). Data science for engineers: A teaching ecosystem. IEEE Signal Processing Magazine, 38(3), 144–153. [Google Scholar] [CrossRef]
- Tolsgaard, M. G., Boscardin, C. K., Park, Y. S., Cuddy, M. M., & Sebok-Syer, S. S. (2020). The role of data science and machine learning in health professions education: Practical applications, theoretical contributions, and epistemic beliefs. Advances in Health Sciences Education, 25, 1057–1086. [Google Scholar] [CrossRef] [PubMed]
- van Eck, N. J., & Waltman, L. (2010). Software survey: VOSviewer, a computer program for bibliometric mapping. Scientometrics, 84(2), 523–538. [Google Scholar] [CrossRef] [PubMed]
- Visser, M., van Eck, N. J., & Waltman, L. (2021). Large-scale comparison of bibliographic data sources: Scopus, Web of Science, Dimensions, Crossref, and Microsoft Academic. Quantitative Science Studies, 2(1), 20–41. [Google Scholar] [CrossRef]
- Walker, R. E. (2024, March 10–13). Mapping curricula to skills and occupations using course descriptions. EDUNINE 2024—8th IEEE World Engineering Education Conference: Empowering Engineering Education: Breaking Barriers Through Research and Innovation, Guatemala City, Guatemala. [Google Scholar] [CrossRef]
- West, J. (2018). Teaching data science: An objective approach to curriculum validation. Computer Science Education, 28(2), 136–157. [Google Scholar] [CrossRef]
- Wiktorski, T., Demchenko, Y., & Belloum, A. (2017, December 11–14). Model curricula for data science EDISON data science framework. 2017 IEEE International Conference on Cloud Computing Technology and Science (CloudCom) (pp. 369–374), Hong Kong. [Google Scholar] [CrossRef]
- Williams, U., Brown, R., Davis, M., Pavri, T., & Shafiei, F. (2021). Teaching data science in political science: Integrating methods with substantive curriculum. PS: Political Science & Politics, 54(2), 336–339. [Google Scholar] [CrossRef]
- Williamson, B. (2015). Governing methods: Policy innovation labs, design and data science in the digital governance of education. Journal of Educational Administration and History, 47(3), 251–271. [Google Scholar] [CrossRef]
- Zhang, Y., Wu, D., Hagen, L., Song, I.-Y., Mostafa, J., Oh, S., Anderson, T., Shah, C., Bishop, B. W., Hopfgartner, F., Eckert, K., Federer, L., & Saltz, J. S. (2022). Data science curriculum in the iField. Journal of the Association for Information Science and Technology, 74, 641–662. [Google Scholar] [CrossRef] [PubMed]
- Zhang, Y., Zhang, T., Jia, Y., Sun, J., Xu, F., & Xu, W. (2017, May 20–28). DataLab: Introducing software engineering thinking into data science education at scale. 2017 IEEE/ACM 39th International Conference on Software Engineering: Software Engineering Education and Training Track (ICSE-SEET) (pp. 47–56), Buenos Aires, Argentina. [Google Scholar] [CrossRef]
Description | Results |
---|---|
Main information about data | |
Timespan | 2005–2024 |
Sources (journals, books, etc.) | 631 |
Documents | 1245 |
Average citations per document | 7.946 |
Average citations per year per document | 2.99 |
References | 36,853 |
Document types | |
Article | 551 |
Book chapter | 47 |
Conference paper | 597 |
Review | 50 |
Document contents | |
Keywords Plus (ID) | 4821 |
Author’s keywords (DE) | 2886 |
Authors | |
Authors | 3980 |
Authors of single-authored documents | 205 |
Author collaboration | |
Single-authored documents | 227 |
Co-authors per document | 3.68 |
International co-authorships % | 16.31 |
Article Citation | Keywords | Type | Main Topic |
---|---|---|---|
(Kruskal et al., 2017) | ACR; big data; data science; deep learning; imaging informatics; Intersociety Committee; machine learning; radiology | Descriptive, summary | Discussion about applications of machine learning for image analysis. |
(Baako et al., 2024) | biomanufacturing; bioprocess engineering; Chinese hamster ovary (CHO) cells; data science; deep learning; multivariate statistical analysis; recombinant protein production | Review | Machine learning and deep learning in bioprocessing. |
(Mike, 2020) | computer science education; data science education | Conference paper | Pedagogical aspects of data science education. |
Country | Article | Keywords | Type | Main Topic |
---|---|---|---|---|
United States | (Kruskal et al., 2017) | ACR; big data; data science; deep learning; imaging informatics; Intersociety Committee; machine learning; radiology | Descriptive, summary | Discussion about applications of machine learning for image analysis
United States | (Baako et al., 2024) | biomanufacturing; bioprocess engineering; Chinese hamster ovary (CHO) cells; data science; deep learning; multivariate statistical analysis; recombinant protein production | Review | Machine learning and deep learning in bioprocessing
United States | (F. Echeverria et al., 2023) | CRISP-DM; data mining; data visualization; database; information technology education; introductory data science | Survey | Integrating data science into a general education information technology course
United States | (Li et al., 2019) | data science; data science education; middle school | Article | Student and teacher feedback to modify CS curriculum
China | (Luo et al., 2024) | Data science applications in education; Distributed learning environments; Evaluation methodologies; Interdisciplinary projects; Postsecondary education | Article, data analysis | Use of machine learning for predictions in education analyzing blended courses
China | (Z. Shao et al., 2024) | Career mining; Data mining & analytics; Data science; Knowledge and technology transfer; Science of science; Scientific big data | Review | Knowledge and technology transfer and the research change of scholars
China | (Lilan & Zhong, 2024) | Data science applications in education; Distance education and online learning; Human-computer interface; Learning communities; Teaching/learning strategies | Article, analysis with neural networks | Recommendation systems; a graph convolutional neural network model based on college English course texts, students’ major, English foundation, and network structure characteristics
United Kingdom | (Mikroyannidis et al., 2018) | Courseware; Curricula; Data science; Demand analysis; Personalised learning pathways; Skills | Conference paper | Presentation of the initiative entitled European Data Science Academy (EDSA) for training new generations of data scientists
United Kingdom | (Dogucu et al., 2023) | Accessibility; Curriculum; Inclusion; Textbooks | Article, descriptive | Introducing a framework for developing accessible and inclusive course materials
United Kingdom | (Demchenko et al., 2014) | Andragogy; Big data architecture framework; Bloom’s taxonomy; Common body of knowledge; Education and training on big data technologies; Instructional methodology; Online education | Conference paper | Description of topics for common body of knowledge for data science and big data technology domains
Article | Keywords | Type | Main Topic |
---|---|---|---|
(Lee & Delaney, 2022) | Curriculum analysis; Data literacy; Data science education; Data science lessons; Secondary school; Statistics education | Article, analysis of data science curricula | Analysis of curricula and professional practice |
(Boaler et al., 2024) | Data science education; Math pathways; Mixed methods | Article, analysis of data science course | Analysis of a high school data science course to identify who takes more mathematics courses, with a focus on STEM
(Tolsgaard et al., 2020) | Artificial intelligence; data science; Machine learning; Medical education research; Research in Health Professions Education | Article, critical review | Analysis of the roles that data science and machine learning play in health professions education
N° | Author | Name in Publications | ORCID | Affiliation | Region/Country |
---|---|---|---|---|---|
1 | SALAS-RUEDA RA | Ricardo-Adán Salas-Rueda | 0000-0002-4188-4610 | Instituto de Ciencias Aplicadas y Tecnología, Universidad Nacional Autónoma de México | Mexico |
2 | DEMCHENKO Y | Yuri Demchenko | 0000-0001-7474-9506 | University of Amsterdam | Amsterdam, the Netherlands |
3 | MIKE K | Koby Mike | 0000-0002-0977-9845 | Department of Education in Science and Technology, Technion | Haifa, Israel |
4 | HAZZAN O | Orit Hazzan | 0000-0002-8627-0997 | Department of Education in Science and Technology, Technion | Haifa, Israel |
5 | ALVARADO-ZAMORANO C | Clara Alvarado-Zamorano | 0000-0001-9122-7590 | Instituto de Ciencias Aplicadas y Tecnología, Universidad Nacional Autónoma de México | Mexico |
6 | WILLIAMSON B | Ben Williamson | 0000-0001-9356-3213 | Centre for Research in Digital Education, School of Education, University of Edinburgh | Edinburgh, the United Kingdom |
7 | BIEHLER R | Rolf Biehler | 0000-0002-9815-1282 | Paderborn University | Paderborn, Germany |
8 | CAR J | Josip Car | 0000-0001-8969-371X | Nanyang Technological University; Imperial College London | Singapore; Westminster, London, the United Kingdom |
9 | GUO PJ | Philip J. Guo | No information or no public ORCID profile | University of California | San Diego, the United States |
10 | HORTON NJ | Nicholas J. Horton | 0000-0003-3332-4311 | Department of Mathematics and Statistics, Amherst College | Amherst, the United States |
11 | LEE VR | Victor R. Lee | 0000-0001-6434-7589 | Stanford University | Stanford, the United States |
12 | RAJ RK | Rajendra K. Raj | 0000-0003-2378-1068 | Rochester Institute of Technology | Rochester, NY, the United States |
13 | DAVIS KC | Karen C. Davis | 0000-0003-2327-4429 | Computer Science and Software Engineering Department, Miami University | Oxford, Ohio, the United States |
14 | DE-LA-CRUZ-MARTÍNEZ G | Gustavo De-La-Cruz-Martínez | 0000-0002-4446-7396 | Instituto de Ciencias Aplicadas y Tecnología, Universidad Nacional Autónoma de México | Mexico |
15 | DOGUCU M | Mine Dogucu | 0000-0002-8007-934X | University of California | Irvine, California, the United States |
16 | MEINERT E | Edward Meinert | 0000-0003-2484-3347 | University of Plymouth; Newcastle University | Plymouth, the United Kingdom; Newcastle, the United Kingdom |
17 | PFANNKUCH M | Maxine Pfannkuch | 0000-0002-2202-9678 | The University of Auckland | Auckland, New Zealand |
18 | SAKR M | Majd Sakr | 0000-0001-5150-8259 | Carnegie Mellon University | Pittsburgh, the United States |
19 | WU W | Wensheng Wu | 0000-0002-2948-9773 | Computer Science Department, University of Southern California | Los Angeles, the United States |
20 | ADAMS J | Joshua Adams | 0000-0002-7185-9125 | Saint Leo University | Saint Leo, the United States |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Avila-Garzon, C.; Bacca-Acosta, J. Curriculum, Pedagogy, and Teaching/Learning Strategies in Data Science Education. Educ. Sci. 2025, 15, 186. https://doi.org/10.3390/educsci15020186