Evaluating the Coverage and Depth of Latent Dirichlet Allocation Topic Model in Comparison with Human Coding of Qualitative Data: The Case of Education Research
Abstract
1. Introduction
2. Overview of Traditional Research Methods in Education
3. Machine Learning and Natural Language Processing Approaches for Analyzing Qualitative Data
4. Methods
4.1. Context and Participants
4.2. Procedures and Data Collection Method
4.3. Approach Used for Traditional/Manual Coding
4.4. Approach Used for LDA Topic Modeling
5. Results
5.1. Students’ Perceptions of Potential Influences of Their Cultural Backgrounds on Their Teamwork Communication Styles
5.2. Students’ Perceptions of Potential Influences of Their Cultural Backgrounds on Their Interactions with Their Teammates
6. Discussion and Practical Implications
7. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Initial Codes | Final Code | Definition |
---|---|---|
Opportunity to communicate | Equal chances to speak | Everyone has equal chances to express their own ideas and thoughts. |
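Everyone can share ideas | Equal chances to speak | Everyone has equal chances to express their own ideas and thoughts. |
Freedom to express | Equal chances to speak | Everyone has equal chances to express their own ideas and thoughts. |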
Codes | Definition | Representative Quote |
---|---|---|
Leadership as needed | Individuals step up and lead the conversation when no one else tries to, or take on a leadership role when required. | “I think culture definitely plays a role in this process within our team because everyone has a different communication style despite the fact we are close in age and studies the same major while being in the same college… I could see that some members came from a culture where speaking up and speaking directly and have strong leadership skills, but someone members came from a culture that feels the opposite.” |
Unnoticeable cultural differences | Cultural background has no obvious influence on communication because group members share a similar culture. | “I don’t think our group is very different in the communication styles we grew up with two of our team members are from within an hour of the University, and two of us are from close to the same place in the United States.” |
Equal chances to speak | Everyone has equal chances to express their own ideas and thoughts. | “In my team, I can be pretty safe to assume that all of us have been raised in an American culture where a shift is being observed of more confrontation and less hierarchy. These two concepts can be observed in our team as we will interject when possible in order to add more to each thought, and we will treat each other equally.” |
Respect for others’ ideas | Showing respect for each other’s different ideas and thoughts. | “Although each of us may have had a similar upbringing and thinking about situations, there are still some cultural differences that influence communication styles and help when solving problems we run into. In our group, we respect when other people are speaking and listen to what they have to say.” |
Trust in teammates | Showing trust for teammates. | “For me, I rely on constant communication and precise communication. I also rely on people holding true to their word. I always grew up where if you say something, you would do it, so when my teammates. say they will do something I trust that they will get it done.” |
Understanding others’ backgrounds | Understanding group members’ cultural backgrounds. | “I think that culture has influenced our process within our team is our work ethic Some people in the group come from a culture that prides itself on work ethic and getting stuff done early. This leads to some coming off really strong when communicating to the team because they want to get stuff done early rather than late.” |
Gender ratio within teams | The gender ratio within a team also influences its communication styles. | “I think that the role of culture has chosen who takes charge during our meeting times. We have 3 males and 1 female in our group. The people who are speaking the most in the group are the males. We need to be aware that [female student name] has great ideas and need to give her the opportunity to share those ideas. Historically culture has said that males are the leaders and that is just not the case.” |
Online communication is more comfortable | Communicating through online platforms helps individuals communicate more comfortably. | “My team communicates over Teams or online platforms. Instead of choosing to meet in-person, we decided, especially in these times to meet over an online platform as it was easy for all team members.” |
Topic Id | Topic Weight | Top 20 Words | Theme | Manual Coding Phase 1 Codes |
---|---|---|---|---|
T1_R1 | 5.51729 | culture team communication group members work influenced role people cultures communicate process speak styles don’t similar things cultural time feel | Culture helps to see people from different perspectives. | Equal chances to speak, respect for others’ ideas, and trust in teammates. |
T2_R1 | 0.70183 | open member Chinese the role day Korean terms note handle found past manner outspoken talking American gave leader prefer mesome longer | Understanding culture promotes bonding. | None. |
T3_R1 | 0.65222 | understanding lot teamwork problem job exposed bit pretty rest there are makes barrier differently big their culture older personalities completely except our conditions | Culture has helped to develop an understanding of people from diverse backgrounds. | Understanding others’ backgrounds. |
T4_R1 | 0.6157 | conversation states teamwork directly persons respect interrupt give students mind current comfortable contribute lack reason due moment case equal worked | The culture has influenced how we talk to one another on the team. | Understanding others’ backgrounds. |
T5_R1 | 0.56384 | males works aspect worry south aspects decided online expected significant speaking likes conflict hasn’t issues great project extremely communicate discussed | Culture can foster stereotypes in leadership roles. | Leadership as needed, gender ratio within teams. |
T6_R1 | 0.552 | scrum complete upbringing times learned tasks situations topics messaging groupmate grow age create collective decision solving meetings talk conflict groups | Culture can cause communication challenges. | None. |
None. | | | None. | Unnoticeable cultural differences. |
None. | | | None. | Online communication is more comfortable. |
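As a rough illustration of how the topic-to-code alignment in the table above can be summarized, the following minimal Python sketch transcribes the mapping and computes the share of manual codes matched by at least one LDA topic. The dictionary layout, the `coverage` function, and this particular definition of coverage are assumptions made for illustration; they are not necessarily the procedure used in the study.

```python
# Topic-to-code mapping transcribed from the table above (first reflection).
# Topics with an empty list had no matching manual code ("None.").
topic_to_codes = {
    "T1_R1": ["Equal chances to speak", "Respect for others' ideas", "Trust in teammates"],
    "T2_R1": [],
    "T3_R1": ["Understanding others' backgrounds"],
    "T4_R1": ["Understanding others' backgrounds"],
    "T5_R1": ["Leadership as needed", "Gender ratio within teams"],
    "T6_R1": [],
}

# The eight manual codes from the preceding code table.
manual_codes = [
    "Leadership as needed",
    "Unnoticeable cultural differences",
    "Equal chances to speak",
    "Respect for others' ideas",
    "Trust in teammates",
    "Understanding others' backgrounds",
    "Gender ratio within teams",
    "Online communication is more comfortable",
]


def coverage(mapping, codes):
    """Return (share of codes matched by any topic, list of unmatched codes).

    This definition of coverage is an illustrative assumption.
    """
    covered = {code for matched in mapping.values() for code in matched}
    missed = [code for code in codes if code not in covered]
    share = (len(codes) - len(missed)) / len(codes)
    return share, missed


share, missed = coverage(topic_to_codes, manual_codes)
# Topics that surfaced a theme with no manual counterpart.
unmatched_topics = [topic for topic, matched in topic_to_codes.items() if not matched]

print(f"Manual codes matched by a topic: {share:.2f}")
print("Codes with no topic:", missed)
print("Topics with no code:", unmatched_topics)
```

Under this definition, six of the eight manual codes are matched by a topic, while topics T2_R1 and T6_R1 have no manual counterpart.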
Codes | Definition | Representative Quotes |
---|---|---|
Parental influence | Some of a person’s qualities are inherited from parents and family members. Parents’ views about life and tasks influence their children’s attitudes and behavior. | “As a child of immigrant parents from China, I grew up under parents who communicated with me in a very direct and simple way. Growing up in essentially a Chinese household, I think this has contributed to my team communication style, where I am very upfront with other members on what needs to be done and when. Obviously being direct sometimes comes with a negative connotation given how stern it sounds, but I think my American culture has made it so that I can be more upfront with others while being wary of how my comments can be perceived by others.” |
Listening rather than speaking | Certain individuals prefer to listen to others rather than express their own opinions. | “I think that my cultural background has influenced our teamwork interaction. I am used to being in team environments where people do not like to talk a lot but listen.” |
Shared responsibilities among teammates | Each participant has an equal role and responsibility. | “I come from a background that prioritizes equality between members, so I believe that everyone in our team should have the same amount of say when it comes to decisions and contribute equally to projects.” |
Respect for others’ ideas | Showing respect for each other’s different ideas and thoughts. | “With my cultural background, I believe it causes me to be very respectful of what others have to say in my team. I think making sure I treat everyone with respect is crucial in a team setting.” |
Task-oriented and individualistic views | Certain students use direct communication and feel more at ease and efficient while working alone. Additionally, some individuals are more concerned with completing the work than with team collaboration. | “Since I personally come from the southern United States, I might have a slightly different culture than my teammates. It’s more acceptable to be more open with people, so I may be being more direct with my teammates than they are used to.” |
Upbringing experiences | The location of one’s upbringing or the setting in which an individual grew up. | “Personally, I grew up in a pretty traditional family environment consisting of partially progressive partial American dreamer family culture. There is a lot of emphasis placed on the importance of education and the value of hard work and close relationships with family. Therefore, this may influence teamwork interaction because it encourages me to update team members on what I have done and ensure that I complete my roles on time and keep up with our team tasks.” |
Topic Id | Topic Weight | Top 20 Words | Theme | Manual Coding Phase 1 Codes |
---|---|---|---|---|
T1_R2 | 0.38247 | leadership comfortable found process master roles adapt complete promotes change confident role kind students states ive taking start point good | Family background and upbringing helped students to develop teamwork skills. | Upbringing experiences, parental influence. |
T2_R2 | 3.47678 | background team cultural culture work teamwork people influence group interaction don’t communication tend speak members family raised make things influences | My cultural background has made me less communicative. | Upbringing experiences, parental influence, and listening rather than speaking. |
T3_R2 | 0.48292 | part years leader social problems thing focused active listen experience issues loud diverse teamwork interaction greatly time considered line paced fast | My family background taught me collaboration. | Parental influence. |
T4_R2 | 0.48133 | time working individual understanding heavily Mexico contribute style work differences related told attending aggressive independent accustomed USA focused interacting high | Personal experiences influence teamwork interaction. | Upbringing experiences. |
T5_R2 | 0.39661 | household bit parents made end everyone style task Kenyan hierarchy nontraditional negative provide toes environment school ive opinions large grew | Teamwork means sharing responsibilities. | Shared responsibilities among teammates. |
T6_R2 | 0.37257 | communicate share Turkish generally submitted realized setting efficiently assume culturally works video interaction because interactions makes likes raised awkward reason tend | High school experiences shaped my teamwork skills. | Upbringing experiences. |
T7_R2 | 0.35795 | point cultural side advocate finish polish effectively personally teams complicated technology responding hard music subcultures play leader taught culture opposite | My cultural background has taught me to respect others. | Respect for others’ ideas. |
T8_R2 | 0.34233 | follow affect groupmate slightly sort observing working create role are more learned born leads positive personal decision needed power create confusion methods very open | My culture has taught me to be open and direct. | Task-oriented and individualistic views. |
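The table above maps several topics to the same manual code. One simple way to see this overlap is to invert the mapping and count how many topics point to each code, as in the minimal Python sketch below. The dictionary and the `topics_per_code` helper are hypothetical names introduced for illustration, and counting topics per code is an assumed summary rather than a measure reported in the study.

```python
from collections import Counter

# Topic-to-code mapping transcribed from the table above (second reflection).
topic_to_codes_r2 = {
    "T1_R2": ["Upbringing experiences", "Parental influence"],
    "T2_R2": ["Upbringing experiences", "Parental influence",
              "Listening rather than speaking"],
    "T3_R2": ["Parental influence"],
    "T4_R2": ["Upbringing experiences"],
    "T5_R2": ["Shared responsibilities among teammates"],
    "T6_R2": ["Upbringing experiences"],
    "T7_R2": ["Respect for others' ideas"],
    "T8_R2": ["Task-oriented and individualistic views"],
}


def topics_per_code(mapping):
    """Count how many topics were aligned with each manual code."""
    return Counter(code for codes in mapping.values() for code in codes)


counts = topics_per_code(topic_to_codes_r2)
for code, n_topics in counts.most_common():
    print(f"{code}: {n_topics} topic(s)")
```

In this transcription, "Upbringing experiences" is linked to four topics and "Parental influence" to three, while each of the remaining four codes is linked to a single topic.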
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Nanda, G.; Jaiswal, A.; Castellanos, H.; Zhou, Y.; Choi, A.; Magana, A.J. Evaluating the Coverage and Depth of Latent Dirichlet Allocation Topic Model in Comparison with Human Coding of Qualitative Data: The Case of Education Research. Mach. Learn. Knowl. Extr. 2023, 5, 473-490. https://doi.org/10.3390/make5020029