Evaluation and Explanation of Post Quality Based on a Multimodal, Multilevel, and Multi-Scope Focused Fusion Mechanism
Abstract
1. Introduction
2. Related Work
2.1. Related Research on Posts
2.2. Multimodal Fusion
3. Proposed Method
3.1. Overview
3.2. Feature Encoder
3.2.1. Text Encoder
3.2.2. Image Encoder
3.3. Multi-Level Multimodal Attention-Based Fusion
3.3.1. Fusion Based on Key Information from Posts
3.3.2. Fusion Based on Key Information from Topic
3.3.3. Emphasis-Based Fusion for Bimodal Data
3.3.4. Emphasis-Based Fusion for Trimodal Data
3.4. Multimodal Decoder
4. Experiments, Results, and Analysis
4.1. Dataset
4.2. Experimental Setup
4.3. Model Comparison
4.4. Model Comparison on a Public Dataset
4.5. Ablation Study
4.6. Case Study
5. Conclusions
6. Future Work
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
Dataset statistics (instance counts, #T/#P):

| Dataset | Train + Dev (#T/#P) | Test (#T/#P) |
|---|---|---|
| Art Literature | 183/3366 | 41/100 |
| Education Teaching | 264/9591 | 121/2480 |
Results on the Art Literature dataset:

| Type | Method | ROUGE-1 | ROUGE-2 | ROUGE-L | BLEU-1 | BLEU-2 | BLEU-3 | BLEU-4 | METEOR |
|---|---|---|---|---|---|---|---|---|---|
| Text-only | PGN | 40.56 | 19.19 | 32.01 | 36.93 | 25.46 | 19.24 | 15.38 | 36.41 |
| Text-only | BART | 44.35 | 23.89 | 35.97 | 39.44 | 28.24 | 22.33 | 17.71 | 35.62 |
| Multimodal | M-Transf | 49.28 | 27.39 | 39.72 | 44.46 | 32.57 | 25.36 | 20.03 | 40.99 |
| Multimodal | ExMore | 51.19 | 31.73 | 43.45 | 38.71 | 29.91 | 24.92 | 21.12 | 41.16 |
| Multimodal | TEAM | 54.55 | 32.35 | 42.58 | 46.93 | 36.54 | 30.47 | 26.03 | 42.45 |
| Multimodal | Our model | 58.32 | 42.67 | 53.54 | 55.22 | 48.47 | 43.80 | 39.75 | 52.77 |
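The tables report ROUGE-1/2/L, BLEU-1 through BLEU-4, and METEOR as percentages (higher is better). The paper does not state which toolkit produced these scores, so the following is only a minimal sketch of how such metrics are commonly computed, assuming the `rouge_score` and `nltk` packages; the reference/prediction strings are invented for illustration.

```python
from rouge_score import rouge_scorer
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction
from nltk.translate.meteor_score import meteor_score

# Purely illustrative reference/prediction pair; a real evaluation would loop
# over the test split and average the scores.
reference = "the post gives a clear answer and a helpful diagram"
prediction = "the post provides a clear answer with a helpful diagram"

# ROUGE-1/2/L F1 scores (rouge_score takes raw strings: target, then prediction).
scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)
rouge = {name: s.fmeasure for name, s in scorer.score(reference, prediction).items()}

# Cumulative BLEU-1..4 on whitespace tokens; smoothing avoids zero scores when
# short explanations share no higher-order n-grams with the reference.
ref_toks, hyp_toks = reference.split(), prediction.split()
smooth = SmoothingFunction().method1
bleu = {
    f"BLEU-{n}": sentence_bleu([ref_toks], hyp_toks,
                               weights=tuple(1.0 / n for _ in range(n)),
                               smoothing_function=smooth)
    for n in range(1, 5)
}

# METEOR over the same tokens (requires the NLTK 'wordnet' corpus download).
meteor = meteor_score([ref_toks], hyp_toks)

print(rouge, bleu, meteor)
```

Averaging such per-example scores over the test split and multiplying by 100 yields numbers on the same scale as the table entries.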
Results on the Education Teaching dataset:

| Type | Method | ROUGE-1 | ROUGE-2 | ROUGE-L | BLEU-1 | BLEU-2 | BLEU-3 | BLEU-4 | METEOR |
|---|---|---|---|---|---|---|---|---|---|
| Text-only | PGN | 49.89 | 30.10 | 43.25 | 46.97 | 35.78 | 29.35 | 24.82 | 45.99 |
| Text-only | BART | 52.69 | 33.11 | 45.92 | 49.10 | 38.51 | 31.72 | 26.73 | 46.41 |
| Multimodal | M-Transf | 56.54 | 38.76 | 50.29 | 53.53 | 43.79 | 37.33 | 32.53 | 50.77 |
| Multimodal | ExMore | 62.62 | 43.78 | 55.47 | 50.88 | 42.48 | 37.08 | 32.97 | 51.54 |
| Multimodal | TEAM | 64.99 | 46.77 | 56.85 | 59.70 | 50.91 | 45.37 | 41.13 | 55.64 |
| Multimodal | Our model | 67.70 | 51.75 | 61.58 | 65.47 | 58.47 | 53.86 | 49.71 | 63.56 |
Results on the public MORE dataset:

| Type | Method | ROUGE-1 | ROUGE-2 | ROUGE-L | BLEU-1 | BLEU-2 | BLEU-3 | BLEU-4 | METEOR |
|---|---|---|---|---|---|---|---|---|---|
| Text-only | PGN | 17.35 | 6.90 | 16.00 | 17.54 | 6.31 | 2.33 | 1.67 | 15.06 |
| Text-only | Transformer | 17.78 | 5.83 | 15.90 | 11.44 | 4.79 | 1.68 | 0.73 | 9.74 |
| Multimodal | M-Transf | 20.99 | 6.98 | 18.77 | 14.37 | 6.48 | 2.94 | 1.57 | 12.84 |
| Multimodal | ExMore | 27.55 | 12.49 | 25.23 | 19.26 | 11.21 | 6.56 | 4.26 | 19.16 |
| Multimodal | TEAM | 51.72 | 34.96 | 50.58 | 55.32 | 45.12 | 38.27 | 33.16 | 50.95 |
| Multimodal | Our model | 52.37 | 36.84 | 52.41 | 56.19 | 46.86 | 39.13 | 34.24 | 51.93 |
Ablation results on the Art Literature dataset:

| Model Variant | ROUGE-1 | ROUGE-2 | ROUGE-L | BLEU-1 | BLEU-2 | BLEU-3 | BLEU-4 | METEOR |
|---|---|---|---|---|---|---|---|---|
| Our model | 58.32 | 42.67 | 53.54 | 55.22 | 48.47 | 43.80 | 39.75 | 52.77 |
| w/o KIFBT | 57.19 | 41.67 | 52.76 | 55.17 | 48.06 | 43.10 | 38.99 | 52.37 |
| w/o KIFBP | 57.42 | 41.16 | 52.43 | 55.12 | 47.96 | 43.04 | 38.78 | 52.25 |
| w/o BFF(I) | 47.01 | 30.84 | 41.19 | 48.34 | 39.70 | 34.61 | 30.81 | 44.71 |
| w/o BFF(II) | 57.55 | 42.13 | 53.39 | 54.86 | 48.02 | 43.33 | 39.39 | 51.95 |
| w/o BFF(III) | 57.99 | 41.67 | 53.58 | 56.29 | 48.79 | 43.71 | 39.55 | 53.01 |
| w/o BFF(IV) | 54.80 | 40.00 | 50.14 | 54.48 | 47.35 | 42.68 | 38.71 | 51.40 |
| w/o FFBT(I) | 56.60 | 41.01 | 52.59 | 54.69 | 47.53 | 42.68 | 38.61 | 51.29 |
| w/o FFBT(II) | 53.90 | 37.98 | 48.52 | 51.61 | 44.90 | 40.64 | 37.08 | 48.11 |
| w/o FFBT(III) | 54.90 | 37.04 | 49.13 | 49.95 | 42.58 | 37.54 | 33.29 | 47.30 |
Ablation results on the Education Teaching dataset:

| Model Variant | ROUGE-1 | ROUGE-2 | ROUGE-L | BLEU-1 | BLEU-2 | BLEU-3 | BLEU-4 | METEOR |
|---|---|---|---|---|---|---|---|---|
| Our model | 67.70 | 51.75 | 61.58 | 65.47 | 58.47 | 53.86 | 49.71 | 63.56 |
| w/o KIFBT | 63.58 | 45.60 | 55.78 | 61.27 | 53.21 | 48.10 | 43.60 | 58.88 |
| w/o KIFBP | 63.63 | 45.12 | 55.92 | 61.57 | 53.18 | 47.95 | 43.44 | 58.42 |
| w/o BFF(I) | 62.04 | 44.04 | 54.45 | 59.97 | 51.78 | 46.62 | 42.10 | 57.62 |
| w/o BFF(II) | 43.42 | 25.59 | 37.46 | 39.32 | 31.24 | 26.82 | 23.13 | 46.37 |
| w/o BFF(III) | 62.77 | 44.89 | 55.99 | 60.71 | 52.66 | 47.54 | 42.97 | 58.05 |
| w/o BFF(IV) | 52.70 | 35.95 | 46.85 | 50.14 | 42.13 | 37.60 | 33.66 | 50.26 |
| w/o FFBT(I) | 53.58 | 36.37 | 46.77 | 53.28 | 44.67 | 39.85 | 35.82 | 48.93 |
| w/o FFBT(II) | 63.90 | 47.34 | 57.68 | 62.60 | 55.08 | 50.16 | 45.86 | 59.79 |
| w/o FFBT(III) | 66.59 | 50.07 | 60.06 | 63.96 | 56.63 | 51.83 | 47.54 | 62.79 |
Guo, X.; Cao, H.; Cui, Y.; Zhao, H. Evaluation and Explanation of Post Quality Based on a Multimodal, Multilevel, and Multi-Scope Focused Fusion Mechanism. Electronics 2025, 14, 656. https://doi.org/10.3390/electronics14040656