# Shared Language: Linguistic Similarity in an Algebra Discussion Forum


## Abstract


## 1. Introduction

#### 1.1. Lexical Entrainment and Semantic Similarity

#### 1.2. Math Discussion Boards and Math Language Processing (MLP)

#### 1.3. Current Study

- RQ#1: Does discourse within an online Math tutoring platform exhibit shared language?
  - RQ#1.1: Do the posts become more linguistically similar as the discussion progresses?
  - RQ#1.2: Are words indicative of similarity between the discourse participants’ posts?
- RQ#2: How is linguistic similarity associated with known desirable social constructs (affect measures and trust) related to student engagement and feelings of affinity and belonging within discourse communities?

## 2. Methods

#### 2.1. Math Nation

#### 2.2. Participants

#### 2.3. Discussion Threads

Each discussion thread comprises a sequence of posts (p₁, p₂, …, pₙ) that varies in length. Posts are timestamped to reveal their sequence, but the time interval between posts varies. There are 4305 threads with a total of 50,975 posts, of which 4244 are the “initial” posts and 46,731 are the “subsequent replies”. The majority of these posts are from students (50,950), with mentors contributing only 25 posts. The average turn-around time from the initial post to the last reply in a thread is 5.72 h. However, some outlying threads took longer than usual: the greatest time interval between the original post and the last post was 70 days, and the shortest was 7 min. The number of posts per thread ranged from 1 to 142, with a mean of 12 posts (SD = 14.5). Given the high variability in the number of posts per thread, we retained threads with at least 4 posts (1st quartile) and at most 16 posts (3rd quartile); hence, 2139 threads were included in this analysis. These exclusion criteria eliminated threads that were extremely short or lengthy.
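The quartile-based filtering described above can be sketched as follows; the `threads` table, its column names, and the post counts are hypothetical stand-ins for the actual dataset (the paper’s fixed bounds of 4 and 16 posts are used as the retention window).

```python
import pandas as pd

# Hypothetical thread table: one row per thread with its post count.
threads = pd.DataFrame({
    "thread_id": range(1, 9),
    "n_posts": [1, 3, 4, 8, 12, 16, 40, 142],
})

# Keep threads between the 1st-quartile and 3rd-quartile post counts
# (4 and 16 posts in the study).
lo, hi = 4, 16
kept = threads[(threads["n_posts"] >= lo) & (threads["n_posts"] <= hi)]
print(len(kept))  # 4 threads retained in this toy example
```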

#### 2.4. Natural Language Processing

#### 2.4.1. Feature Engineering

#### 2.4.2. Text Classification

The aggregated thread-level similarity scores were dichotomized into two classes: **Not Similar**, representing threads with a mean similarity score below the overall mean, and **Similar**, representing threads with a score greater than or equal to the overall mean (i.e., 0 denotes threads that are not semantically similar on average, and 1 denotes threads that are, on average, semantically similar). We qualitatively evaluated the validity of the dichotomous classes by randomly checking a sample of posts (n = 20) for each class and found that the dichotomous categories appropriately represented the semantic similarity of the threaded discussions. The qualitative verification included threads with outlying, extreme aggregated similarity values as well as values nearest to the dichotomization cut-off (the mean similarity score). For feature selection/dimensionality reduction, we used SelectKBest, which selects the best features based on their χ² values. Using the reduced feature matrix, the logistic regression classification model was trained with 10-fold cross-validation at the thread level, repeated three times. Finally, model performance was evaluated using the F1-score as the metric.
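The pipeline described above (TF-IDF features, SelectKBest with the χ² statistic, logistic regression under three repeats of 10-fold cross-validation, scored with F1) can be sketched with scikit-learn. The toy texts and labels below are illustrative stand-ins, not the study data.

```python
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score

# Toy stand-ins for the thread texts and the dichotomous outcome (1 = Similar).
texts = (["how do i take the square root of this number"] * 15
         + ["please watch section one topic two again"] * 15)
labels = [1] * 15 + [0] * 15

pipe = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2))),   # unigrams and bigrams
    ("select", SelectKBest(chi2, k=10)),              # keep the 10 best features by chi-squared
    ("clf", LogisticRegression(max_iter=1000)),
])

# Three repeats of 10-fold cross-validation, evaluated with the F1-score.
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=0)
scores = cross_val_score(pipe, texts, labels, cv=cv, scoring="f1")
print(round(float(scores.mean()), 2))  # 1.0 on this trivially separable toy data
```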

#### 2.4.3. Model Explainability

In addition to the global word-level importance obtained from the χ² values, we wanted to supplement the findings by understanding word contributions at the local prediction level. LIME provides a means to explain and visualize individual predictions. Even with acceptable levels of accuracy and other measures of performance, model explainability at prediction-level granularity provides more relevant and useful insights. LIME interprets individual predictions without looking into the model, learning an interpretable surrogate model to explain each prediction [45]. We ran repeated iterations of LIME to obtain more stable results and facilitate interpretation.
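As a rough illustration of the idea behind LIME (not the authors’ implementation, which uses the LIME library itself), the sketch below perturbs a text by masking words, queries a hypothetical black-box classifier `black_box`, and fits a locally weighted linear model whose coefficients serve as word-level contributions.

```python
import numpy as np
from sklearn.linear_model import Ridge

def black_box(texts):
    """Hypothetical classifier: P(Similar) rises when the word 'help' appears."""
    return np.array([min(1.0, 0.2 + 0.4 * t.split().count("help")) for t in texts])

def explain(text, n_samples=500, seed=0):
    rng = np.random.default_rng(seed)
    words = text.split()
    # Binary masks: 1 keeps a word, 0 removes it.
    masks = rng.integers(0, 2, size=(n_samples, len(words)))
    masks[0] = 1  # keep the unperturbed instance in the sample
    perturbed = [" ".join(w for w, m in zip(words, row) if m) for row in masks]
    probs = black_box(perturbed)
    # Weight perturbations by closeness to the original (fraction of words kept).
    weights = masks.mean(axis=1)
    surrogate = Ridge(alpha=1.0)
    surrogate.fit(masks, probs, sample_weight=weights)
    return dict(zip(words, surrogate.coef_))

contrib = explain("please help me with this equation")
print(max(contrib, key=contrib.get))  # help
```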

#### 2.5. Cluster Analysis

## 3. Results

#### 3.1. Similarity as Proxy for Shared Language in Discourse

- RQ#1 Does discourse within an online Math tutoring platform exhibit shared language?

Two similarity measures were computed for each thread: the similarity of each post to the original post (OP similarity) and the pairwise similarity between consecutive posts (P2P similarity; pₙ to pₙ₊₁). The sequence of pairwise similarity scores reflects the sequence of the posts as revealed in the timestamps. The means of the two post similarity measures differ minimally (OP similarity M = 0.1777, SD = 0.0915, min = −0.0520, max = 0.7618; P2P similarity M = 0.2108, SD = 0.1134, min = −0.0627, max = 0.7380). The two post similarity scores are positively correlated (r = 0.41, p < 0.001). The minimal difference and the positive correlation between OP similarity and P2P similarity indicate that the similarity of individual posts to the original post is almost the same as their similarity to subsequent replies. Hence, if a post is similar to the original post, similarity with the succeeding post is also noted.
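A minimal sketch of the two measures, assuming one embedding vector per post (toy random vectors stand in for the Universal Sentence Encoder embeddings used in the study):

```python
import numpy as np

def cos(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(0)
posts = rng.normal(size=(5, 8))  # 5 posts, 8-dim toy embeddings

# OP similarity: every reply compared against the original post (index 0).
op_sim = [cos(posts[0], p) for p in posts[1:]]
# P2P similarity: each post compared against the next one in sequence.
p2p_sim = [cos(posts[i], posts[i + 1]) for i in range(len(posts) - 1)]
print(len(op_sim), len(p2p_sim))  # 4 4
```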

#### Similarity Scores Variance Analysis

The variances of the similarity scores differed significantly across threads (χ²(2110) = 3192, p < 0.001; Levene’s F(2110, 11797) = 2). These results reiterate the significant differences in the variances of the similarity scores of posts within the threads. In other words, the linguistic similarity scores of the posts differed across the discussion threads, such that some threads were linguistically similar and some were not.
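The two homogeneity-of-variance checks can be sketched with SciPy; the three toy threads below (each list holding one thread’s post similarity scores) stand in for the actual per-thread data.

```python
from scipy import stats

# Toy per-thread similarity scores: each list is one thread's post scores.
thread_a = [0.10, 0.12, 0.11, 0.13]   # low-variance thread
thread_b = [0.05, 0.45, 0.10, 0.70]   # high-variance thread
thread_c = [0.20, 0.22, 0.19, 0.21]

# Bartlett's test yields a chi-squared statistic; Levene's test an F statistic.
chi2_stat, p_bartlett = stats.bartlett(thread_a, thread_b, thread_c)
f_stat, p_levene = stats.levene(thread_a, thread_b, thread_c)
print(p_bartlett < 0.05)  # True: variances differ across the toy threads
```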

- RQ#1.1 Do the posts become more linguistically similar as the discussion progresses?

#### 3.2. Similarity Predictive Model

- RQ#1.2 Are words indicative of similarity between discourse participants’ posts?

For this model, we used the pairwise post-to-post similarity scores (pₙ to pₙ₊₁) to determine the turn-by-turn progression of similarity. First, we generated a dichotomous similarity outcome variable by categorizing the similarity scores into two bins (i.e., similar and dissimilar). The category was similar if the score was greater than or equal to the average score; otherwise, the category was dissimilar. This dichotomous categorization distinguishes threads that depict lexical entrainment and shared language from threads that were less similar than the average tendency. We then extracted the TF-IDF features from the text and used SelectKBest to retain the best features (i.e., those with higher χ² values) for classification. The top 10 word features (i.e., unigrams and bigrams) are presented in Table 1. Logistic regression was used to fit the model with three repeats of 10-fold cross-validation. The final logistic regression model achieved a good fit (accuracy = 0.73, F1-score = 0.67).
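The mean-threshold dichotomization can be sketched as follows (toy scores, not the study data):

```python
import numpy as np

# Toy aggregated thread similarity scores.
scores = np.array([0.05, 0.12, 0.18, 0.21, 0.35, 0.40])

# Threads at or above the mean are labeled 1 (Similar), the rest 0 (Not Similar).
labels = (scores >= scores.mean()).astype(int)
print(labels.tolist())  # [0, 0, 0, 0, 1, 1]
```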

#### Local Analysis of Predictions

The thread’s true class is **Similar** (Pr(**Similar**) = 0.4230), but it was incorrectly predicted as **Not Similar** (Pr(**Not Similar**) = 0.5770). However, looking at the probabilities, the difference between the classes is relatively small. The words that contribute to the classification decision are presented in Table 3.

The thread’s true class is **Not Similar** (Pr(**Not Similar**) = 0.7195; Pr(**Similar**) = 0.2805), and it is correctly predicted as such. In this correct prediction, the probability of the true class is notably higher than that of the incorrect class. The words that contribute to the prediction are presented in Table 5.

The thread’s true class is **Not Similar**, and it has been correctly predicted as such (Pr(**Not Similar**) = 0.5028; Pr(**Similar**) = 0.4972). The difference between the probabilities of the two classes is relatively small. The words that contribute to the prediction are presented in Table 7.

#### 3.3. Cluster Analysis

- RQ2: How is linguistic similarity associated with known desirable social constructs (affect measures and trust) related to student engagement and feelings of affinity and belonging within discourse communities?

## 4. Discussion

- Shared language in the Math Wall discourse

- Lexical choice as a predictor of linguistic similarity

- Linguistic similarity and desirable social constructs

## 5. Limitations, Conclusions, and Future Work

- Implications for Discussion Boards as a Pedagogical Tool

- Implications for the Scalability of NLP Language Models for the Mathematics Domain

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest

## Appendix A. Sample Threads Per Cluster

| Turn | Message |
|---|---|
| Post | I am so confused please help |
| Reply | Hello <<Student1>>, have you watched Section 1, Topic 2? |
| Reply | Yes, a while ago. I can go watch it again. |
| Reply | No, that’s ok |
| Reply | Okay, thank you! |
| Reply | Or you could. Alright, if you still have questions, be sure to come back |

| Turn | Message |
|---|---|
| Post | I need help solving this. |
| Reply | Hello <<Student1>>, your first step is to remove the parenthesis around the exponent 1, so the exponents combine to make 1 times 3/4 |
| Reply | Ok after I do that what do I do? |
| Reply | Ok, so then I get 3/4 as the answer right? |
| Reply | Yes I got that |
| Reply | So, (1/p squared + 1/p) to the power of p |
| Reply | I am stuck on this problem too. What would be the next step? |
| Reply | (28/9) raised to the power 3/4? |
| Reply | Thank you |
| Reply | After that step I got (28/9) raised to the power 3/4 |
| Reply | any way to show how you got to 28/9? |
| Reply | Is the final answer (3.111) raised to the power ¾ |
| Reply | Thank you very much |
| Reply | <<Student2>> when you do (1/9/16 + 1/3/4) to the power of 3/4 you will get (28/9) raised to the power ¾ |
| Reply | Thank you, <<Student1>> |

| Turn | Message |
|---|---|
| Post | How do i solve a to the second power x a to the third power? |
| Reply | do you mean |
| Reply | Yes |
| Reply | when multiplying same base exponents, you add the exponent values |
| Reply | x² * x = x³ |
| Reply | were you talking to me? that was an example |
| Reply | its ok |

## Appendix B. Two-step Cluster Analysis ANOVA and Effect Sizes

| Construct | Source | Sum of Squares | df | Mean Square | F (p < 0.001) | Eta-Squared |
|---|---|---|---|---|---|---|
| Similarity | Between Groups | 14.069 | 2 | 7.035 | 761.140 | 0.285 |
| | Within Groups | 35.370 | 3827 | 0.009 | | |
| | Total | 49.439 | 3829 | | | |
| Trust | Between Groups | 0.158 | 2 | 0.079 | 155.809 | 0.075 |
| | Within Groups | 1.944 | 3827 | 0.001 | | |
| | Total | 2.102 | 3829 | | | |
| Valence | Between Groups | 57.741 | 2 | 28.871 | 1928.289 | 0.502 |
| | Within Groups | 57.299 | 3827 | 0.015 | | |
| | Total | 115.040 | 3829 | | | |
| Arousal | Between Groups | 37.885 | 2 | 18.943 | 1987.085 | 0.509 |
| | Within Groups | 36.482 | 3827 | 0.010 | | |
| | Total | 74.368 | 3829 | | | |
| Polarity | Between Groups | 1.480 | 2 | 0.740 | 41.546 | 0.021 |
| | Within Groups | 68.180 | 3827 | 0.018 | | |
| | Total | 69.661 | 3829 | | | |
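The eta-squared values in the Appendix B table can be reproduced from the sums of squares, since η² = SS_between / SS_total:

```python
# Between-groups and total sums of squares from the Appendix B ANOVA table.
ss_between = {"Similarity": 14.069, "Trust": 0.158, "Valence": 57.741,
              "Arousal": 37.885, "Polarity": 1.480}
ss_total = {"Similarity": 49.439, "Trust": 2.102, "Valence": 115.040,
            "Arousal": 74.368, "Polarity": 69.661}

# Eta-squared: proportion of total variance explained by cluster membership.
eta_sq = {k: round(ss_between[k] / ss_total[k], 3) for k in ss_between}
print(eta_sq["Similarity"])  # 0.285
```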

## References

- Chi, M.T.; Wylie, R. The ICAP framework: Linking cognitive engagement to active learning outcomes. Educ. Psychol. **2014**, 49, 219–243.
- Menekse, M.; Chi, M.T. The role of collaborative interactions versus individual construction on students’ learning of engineering concepts. Eur. J. Eng. Educ. **2018**, 44, 702–725.
- Roscoe, R.D.; Gutierrez, P.J.; Wylie, R.; Chi, M.T. Evaluating Lesson Design and Implementation within the ICAP Framework; International Society of the Learning Sciences: Boulder, CO, USA, 2014.
- D’Angelo, S.; Gergle, D. Gazed and confused: Understanding and designing shared gaze for remote collaboration. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, San Jose, CA, USA, 7–12 May 2016; pp. 2492–2496.
- Bizzell, P. Academic Discourse and Critical Consciousness; University of Pittsburgh: Pittsburgh, PA, USA, 1992.
- Hyland, K. Academic discourse. In Continuum Companion to Discourse Analysis; Bloomsbury Publishing: London, UK, 2011; pp. 171–184.
- Mauranen, A. A rich domain of ELF-the ELFA corpus of academic discourse. Nord. J. Engl. Stud. **2006**, 5, 145–159.
- Liebman, N.; Gergle, D. Capturing turn-by-turn lexical similarity in text-based communication. In Proceedings of the 19th ACM Conference on Computer-Supported Cooperative Work & Social Computing, San Francisco, CA, USA, 27 February–2 March 2016; pp. 553–559.
- Palloff, R.M.; Pratt, K. Building Online Learning Communities: Effective Strategies for the Virtual Classroom; John Wiley & Sons: Hoboken, NJ, USA, 2007.
- Chen, P.S.D.; Lambert, A.D.; Guidry, K.R. Engaging online learners: The impact of Web-based learning technology on college student engagement. Comput. Educ. **2010**, 54, 1222–1232.
- Hernández-Lara, A.B.; Serradell-López, E. Student interactions in online discussion forums: Their perception on learning with business simulation games. Behav. Inf. Technol. **2018**, 37, 419–429.
- Hussain, M.; Zhu, W.; Zhang, W.; Abidi, S.M.R. Student engagement predictions in an e-learning system and their impact on student course assessment scores. Comput. Intell. Neurosci. **2018**, 2018, 6347186.
- Romero, C.; López, M.I.; Luna, J.M.; Ventura, S. Predicting students’ final performance from participation in on-line discussion forums. Comput. Educ. **2013**, 68, 458–472.
- Yukselturk, E. An investigation of factors affecting student participation level in an online discussion forum. Turk. Online J. Educ. Technol.-TOJET **2010**, 9, 24–32.
- D’Mello, S.K.; Graesser, A. Language and discourse are powerful signals of student emotions during tutoring. IEEE Trans. Learn. Technol. **2012**, 5, 304–317.
- Garrod, S.; Pickering, M.J. Why is conversation so easy? Trends Cogn. Sci. **2004**, 8, 8–11.
- Gonzales, A.L.; Hancock, J.T.; Pennebaker, J.W. Language style matching as a predictor of social dynamics in small groups. Commun. Res. **2010**, 37, 3–19.
- Garrod, S.; Anderson, A. Saying what you mean in dialogue: A study in conceptual and semantic co-ordination. Cognition **1987**, 27, 181–218.
- Brennan, S.E. Lexical entrainment in spontaneous dialog. Proc. ISSD **1996**, 96, 41–44.
- Scissors, L.E.; Gill, A.J.; Geraghty, K.; Gergle, D. In CMC we trust: The role of similarity. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, Boston, MA, USA, 4–9 April 2009; pp. 527–536.
- Scissors, L.E.; Gill, A.J.; Gergle, D. Linguistic mimicry and trust in text-based CMC. In Proceedings of the 2008 ACM Conference on Computer Supported Cooperative Work, San Diego, CA, USA, 8–12 November 2008; pp. 277–280.
- Friedberg, H.; Litman, D.; Paletz, S.B. Lexical entrainment and success in student engineering groups. In Proceedings of the 2012 IEEE Spoken Language Technology Workshop (SLT), Miami, FL, USA, 2–5 December 2012; IEEE: Piscataway, NJ, USA, 2012; pp. 404–409.
- Liu, Y.; Li, A.; Dang, J.; Zhou, D. Semantic and Acoustic-Prosodic Entrainment of Dialogues in Service Scenarios. In Proceedings of the Companion Publication of the 2021 International Conference on Multimodal Interaction, Montreal, QC, Canada, 18–22 October 2021; pp. 71–74.
- Lin, D. An information-theoretic definition of similarity. In Proceedings of the Fifteenth International Conference on Machine Learning (ICML 1998), Madison, WI, USA, 24–27 July 1998; Volume 98, pp. 296–304.
- Princeton University. “About WordNet.” WordNet; Princeton University: Princeton, NJ, USA, 2010.
- Pawar, A.; Mago, V. Challenging the boundaries of unsupervised learning for semantic similarity. IEEE Access **2019**, 7, 16291–16308.
- McNamara, D.S. Computational methods to extract meaning from text and advance theories of human cognition. Top. Cogn. Sci. **2011**, 3, 3–17.
- Banawan, M.; Shin, J.; Balyan, R.; Leite, W.L.; McNamara, D.S. Math Discourse Linguistic Components (Cohesive Cues within a Math Discussion Board Discourse). In Proceedings of the Ninth ACM Conference on Learning@Scale, New York, NY, USA, 1–3 June 2022; pp. 389–394.
- Greiner-Petter, A.; Youssef, A.; Ruas, T.; Miller, B.R.; Schubotz, M.; Aizawa, A.; Gipp, B. Math-word embedding in math search and semantic extraction. Scientometrics **2020**, 125, 3017–3046.
- Jo, H.; Kang, D.; Head, A.; Hearst, M.A. Modeling Mathematical Notation Semantics in Academic Papers. In Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, Punta Cana, Dominican Republic, 16–20 November 2021; pp. 3102–3115.
- Ferreira, D.; Freitas, A. Premise selection in natural language mathematical texts. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Online, 6–8 July 2020; pp. 7365–7374.
- Patel, A.; Bhattamishra, S.; Goyal, N. Are NLP Models really able to Solve Simple Math Word Problems? arXiv **2021**, arXiv:2103.07191.
- Algebra Nation. Available online: https://lastinger.center.ufl.edu/mathematics/algebra-nation/ (accessed on 22 January 2021).
- Leite, W.L.; Jing, Z.; Kuang, H.; Kim, D.; Huggins-Manley, A.C. Multilevel Mixture Modeling with Propensity Score Weights for Quasi-Experimental Evaluation of Virtual Learning Environments. Struct. Equ. Model. A Multidiscip. J. **2021**, 28, 964–982.
- Leite, W.L.; Cetin-Berber, D.D.; Huggins-Manley, A.C.; Collier, Z.K.; Beal, C.R. The relationship between Algebra Nation usage and high-stakes test performance for struggling students. J. Comput. Assist. Learn. **2019**, 35, 569–581.
- Honnibal, M.; Montani, I. spaCy 2: Natural language understanding with Bloom embeddings, convolutional neural networks and incremental parsing. Appear **2017**, 7, 411–420.
- Pennington, J.; Socher, R.; Manning, C.D. Glove: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar, 25–29 October 2014; pp. 1532–1543.
- Available online: https://github.com/MartinoMensio/spacy-universal-sentence-encoder-tfhub (accessed on 28 October 2022).
- Available online: https://tfhub.dev/google/universal-sentence-encoder/4 (accessed on 28 October 2022).
- Hutto, C.; Gilbert, E. Vader: A parsimonious rule-based model for sentiment analysis of social media text. In Proceedings of the International AAAI Conference on Web and Social Media, Ann Arbor, MI, USA, 1–4 June 2014; Volume 8, pp. 216–225.
- Warriner, A.B.; Kuperman, V.; Brysbaert, M. Norms of valence, arousal, and dominance for 13,915 English lemmas. Behav. Res. Methods **2013**, 45, 1191–1207.
- Mohammad, S.M.; Turney, P.D. Crowdsourcing a word–emotion association lexicon. Comput. Intell. **2013**, 29, 436–465.
- Mainz, N.; Shao, Z.; Brysbaert, M.; Meyer, A.S. Vocabulary knowledge predicts lexical processing: Evidence from a group of participants with diverse educational backgrounds. Front. Psychol. **2017**, 8, 1164.
- Yap, M.J.; Balota, D.A.; Sibley, D.E.; Ratcliff, R. Individual differences in visual word recognition: Insights from the English Lexicon Project. J. Exp. Psychol. Hum. Percept. Perform. **2012**, 38, 53.
- Ribeiro, M.T.; Singh, S.; Guestrin, C. “Why should I trust you?” Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 1135–1144.
- Gelbard, R.; Goldman, O.; Spiegler, I. Investigating diversity of clustering methods: An empirical comparison. Data Knowl. Eng. **2007**, 63, 155–166.
- Benassi, M.; Garofalo, S.; Ambrosini, F.; Sant’Angelo, R.P.; Raggini, R.; De Paoli, G.; Ravani, C.; Giovagnoli, S.; Orsoni, M.; Piraccini, G. Using two-step cluster analysis and latent class cluster analysis to classify the cognitive heterogeneity of cross-diagnostic psychiatric inpatients. Front. Psychol. **2020**, 11, 1085.
- Paxton, A.; Roche, J.M.; Ibarra, A.; Tanenhaus, M.K. Failure to (mis)communicate: Linguistic convergence, lexical choice, and communicative success in dyadic problem solving. In Proceedings of the Annual Meeting of the Cognitive Science Society, Quebec City, QC, Canada, 23–26 July 2014; Volume 36.
- Tosi, A. Adjusting Linguistically to Others: The Role of Social Context in Lexical Choices and Spatial Language. Ph.D. Thesis, The University of Edinburgh, Edinburgh, UK, 2017.
- Lapadat, J. Discourse devices used to establish community, increase coherence, and negotiate agreement in an online university course. Int. J. E-Learn. Distance Educ. **2007**, 21, 59–92.
- Lavasani, M.G.; Khandan, F. The effect of cooperative learning on mathematics anxiety and help seeking behavior. Procedia-Soc. Behav. Sci. **2011**, 15, 271–276.
- Qayyum, A. Student help-seeking attitudes and behaviors in a digital era. Int. J. Educ. Technol. High. Educ. **2018**, 15, 17.
- Dadure, P.; Pakray, P.; Bandyopadhyay, S. Mathematical Information Retrieval Trends and Techniques. In Deep Natural Language Processing and AI Applications for Industry 5.0; IGI Global: Hershey, PA, USA, 2021; pp. 74–92.

| Feature | χ² | Feature | χ² |
|---|---|---|---|
| square | 3.20 | make sure | 2.48 |
| square root | 3.12 | graph | 2.47 |
| pemdas | 3.07 | coordinates | 2.35 |
| alright | 2.97 | equation | 2.21 |
| hello | 2.55 | subtracting sides | 1.98 |

| Turn | Message |
|---|---|
| Post | Where can I find help with properties of exponents? |
| Reply | Section 1 topic 2 |
| Reply | You can find it by pressing Section 1 and then pressing the video of topic 2. |
| Reply | Hi, you can look at the 8th grade section and it should be section 8.7 of that |
| Reply | Those are all about exponents |
| Reply | you can refer to section 1 topic 2 |
| Reply | Section 1 topic 2,4,5,6 could help. |

**Most Informative Words**

| Class: Not Similar | Class: Similar |
|---|---|
| ‘section’, −0.0589 | ‘grade’, 0.0251 |
| ‘topic’, −0.0397 | ‘help’, 0.0211 |
| ‘property’, −0.0219 | ‘exponent’, 0.0157 |

| Turn | Message |
|---|---|
| Post | The police department is having a bake sale. Donuts cost $1.50 each and cinnamon roll costs $2.50 each. The department uses the algebraic expression 1.50 |
| Reply | The question isn’t finished... Can you put the rest? |
| Reply | The police department is having a bake sale. Donuts cost $1.50 each and cinnamon rolls cost $2.50 each. The department uses the algebraic expression 1.50 |
| Reply | a. What does the x variable represent? b. What does the y variable represent? c. A family buys 3 donuts and 4 cinnamon rolls. What are their total expenses? |
| Reply | The department uses the algebraic expression 1.50 |

**Most Informative Words**

| Class: Not Similar | Class: Similar |
|---|---|
| ‘expression’, −0.06372 | ‘having’, 0.0129 |
| ‘represent’, −0.03780 | ‘question’, 0.0060 |
| ‘variable’, −0.0193 | |

| Turn | Message |
|---|---|
| Post | where can i find a section on integers? |
| Reply | What do you mean by integers? Rational and Irrational? |
| Reply | ex. −3 + −4 = −7 |
| Reply | I think that deals with rational and irrational numbers, section 1 topic 3 |
| Reply | ok thanks |
| Reply | No problem, glad that I could help |

**Most Informative Words**

| Class: Not Similar | Class: Similar |
|---|---|
| ‘section’, −0.0360 | ‘glad’, 0.0365 |
| ‘thank’, −0.0290 | ‘help’, 0.0315 |
| ‘mean’, −0.0202 | ‘integer’, 0.0263 |
| ‘problem’, −0.0160 | ‘Irrational’, 0.0195 |
| | ‘number’, 0.0186 |

| Measure | Mean | SD | Min | Max |
|---|---|---|---|---|
| Similarity | 0.1777 | 0.0915 | −0.0520 | 0.7618 |
| Trust | 0.1617 | 0.0200 | 0 | 0.1622 |
| Valence * | 0.1482 | 0.1470 | 0 | 1.1824 |
| Arousal * | 0.1231 | 0.1199 | 0 | 0.9200 |
| Polarity | 0.1212 | 0.1259 | −0.5303 | 0.8530 |

| Cluster | N | % of Total |
|---|---|---|
| 1 | 974 | 24.4 |
| 2 | 461 | 12.0 |
| 3 | 2395 | 62.5 |
| Total | 3830 | |

| Cluster | Similarity M (SD) | Trust M (SD) | Valence M (SD) | Arousal M (SD) | Polarity M (SD) |
|---|---|---|---|---|---|
| 1 | 0.2944 (0.1496) | 0.0187 (0.0267) | 0.0430 (0.0725) | 0.0363 (0.0620) | 0.0143 (0.0206) |
| 2 | 0.2259 (0.1486) | 0.0330 (0.0438) | 0.4642 (0.2690) | 0.3780 (0.2073) | 0.0151 (0.0173) |
| 3 | 0.1539 (0.0495) | 0.0130 (0.0124) | 0.1288 (0.0888) | 0.1073 (0.0736) | 0.0110 (0.0007) |


© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Banawan, M.P.; Shin, J.; Arner, T.; Balyan, R.; Leite, W.L.; McNamara, D.S.
Shared Language: Linguistic Similarity in an Algebra Discussion Forum. *Computers* **2023**, *12*, 53.
https://doi.org/10.3390/computers12030053
