Co-Training Semi-Supervised Deep Learning for Sentiment Classification of MOOC Forum Posts
Abstract
1. Introduction
2. Related Work
2.1. Sentiment Classification of MOOC Forum Posts
2.2. Co-Training for Semi-Supervised Learning
2.3. Loss Function
3. Method
3.1. Overview
3.2. Embedding Layers from the Two Views
3.2.1. GN-Embedding Layer
3.2.2. ELMo-Embedding Layer
3.3. Deep Neural Network
3.3.1. Convolutional Layer
3.3.2. Max-Pooling Layer
3.3.3. Softmax Layer
3.4. Double-Check Strategy Sample Selection
3.5. A Mixed Loss Function
4. Experimental Setup
4.1. Dataset
4.2. Comparison
4.3. Parameter Settings
4.4. Model Evaluation
5. Experimental Results
5.1. Overall Performance
5.1.1. Comparison Results of the Three Groups
5.1.2. Impact of Percentage of Initial Labeled Data
5.1.3. Comparison Results of Different Classifiers
5.2. Impact of the Two Views
5.2.1. Comparison Results between the Two Views and a Single View
5.2.2. Comparison Results between GN and ELMo
5.3. Impact of the Double-Check Strategy
5.3.1. Comparison of the Double-Check Strategy and a Single Strategy
5.3.2. Details of the Double-Check Strategy Sample Selection
5.4. Impact of the Mixed Loss Function
5.4.1. Parameter Settings of the Focal Loss
5.4.2. Impact of the Focal Loss
5.4.3. Impact of the Mixed Loss Function for Semi-Supervised Learning
6. Discussion
7. Conclusions
Author Contributions
Funding
Conflicts of Interest
References
- Class Central. Available online: https://www.classcentral.com/moocs-year-in-review-2013 (accessed on 7 May 2019).
- Class Central. Available online: https://www.classcentral.com/report/mooc-stats-2018 (accessed on 7 May 2019).
- Tsironis, A.; Katsanos, C.; Xenos, M. Comparative usability evaluation of three popular MOOC platforms. In Proceedings of the 2016 IEEE Global Engineering Education Conference (EDUCON), Abu Dhabi, UAE, 10–13 April 2016; pp. 608–612. [Google Scholar]
- Rai, L.; Chunrao, D. Influencing factors of success and failure in MOOC and general analysis of learner behavior. Int. J. Inf. Educ. Technol. 2016, 6, 262–268. [Google Scholar] [CrossRef] [Green Version]
- Gil, R.; Virgili-Gomá, J.; García, R.; Mason, C. Emotions ontology for collaborative modelling and learning of emotional responses. Comput. Hum. Behav. 2015, 51, 610–617. [Google Scholar] [CrossRef] [Green Version]
- Cabada, R.Z.; Estrada, M.L.B.; Bustillos, R.O. Mining of Educational Opinions with Deep Learning. J. Univ. Comput. Sci. 2018, 24, 1604–1626. [Google Scholar]
- Ain, Q.T.; Ali, M.; Riaz, A.; Noureen, A.; Kamran, M.; Hayat, B.; Rehman, A. Sentiment Analysis Using Deep Learning Techniques: A Review. Int. J. Adv. Comput. Sci. Appl. 2017, 8, 424–433. [Google Scholar]
- Shapiro, H.B.; Lee, C.H.; Roth, N.E.W.; Li, K.; Çetinkaya-Rundel, M.; Canelas, D.A. Understanding the massive open online course (MOOC) student experience. Comput. Educ. 2017, 110, 35–50. [Google Scholar] [CrossRef]
- Zhang, L.; Wang, S.; Liu, B. Deep learning for sentiment analysis: A survey. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 2018, 8, e1253. [Google Scholar] [CrossRef] [Green Version]
- Severyn, A.; Moschitti, A. Twitter Sentiment Analysis with Deep Convolutional Neural Networks. In Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, Santiago, Chile, 9–13 August 2015; pp. 959–962. [Google Scholar]
- Day, M.Y.; Lee, C.C. Deep learning for financial sentiment analysis on finance news providers. In Proceedings of the 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), Davis, CA, USA, 18–21 August 2016; pp. 1127–1134. [Google Scholar]
- Tang, D.; Qin, B.; Liu, T.; Yang, Y. User modeling with neural network for review rating prediction. In Proceedings of the 24th International Joint Conference on Artificial Intelligence, Buenos Aires, Argentina, 25–31 July 2015; pp. 1340–1346. [Google Scholar]
- Wei, X.; Lin, H.; Yang, L.; Yu, Y. A Convolution-LSTM-Based Deep Neural Network for Cross-Domain MOOC Forum Post Classification. Information 2017, 8, 92. [Google Scholar] [CrossRef] [Green Version]
- Johnson, R.; Zhang, T. Semi-supervised convolutional neural networks for text categorization via region embedding. In Proceedings of the Advances in Neural Information Processing Systems, Montreal, QC, Canada, 7–12 December 2015; pp. 919–927. [Google Scholar]
- Zhou, Z.H.; Li, M. Semi-supervised learning by disagreement. Knowl. Inf. Syst. 2010, 24, 415–439. [Google Scholar] [CrossRef]
- Szymański, J. Comparative analysis of text representation methods using classification. Cybern. Syst. 2014, 45, 180–199. [Google Scholar] [CrossRef]
- Chen, X.; Xu, L.; Liu, Z.; Sun, M.; Luan, H. Joint learning of character and word embeddings. In Proceedings of the 24th International Joint Conference on Artificial Intelligence, Buenos Aires, Argentina, 25–31 July 2015; pp. 1236–1242. [Google Scholar]
- Johnson, R.; Zhang, T. Supervised and semi-supervised text categorization using LSTM for region embeddings. arXiv 2016, arXiv:1602.02373. [Google Scholar]
- Miyato, T.; Dai, A.M.; Goodfellow, I. Adversarial training methods for semi-supervised text classification. arXiv 2016, arXiv:1605.07725. [Google Scholar]
- Wan, X. Co-training for cross-lingual sentiment classification. In Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP, Suntec, Singapore, 2–7 August 2009; pp. 235–243. [Google Scholar]
- Li, S.; Wang, Z.; Zhou, G.; Lee, S.Y.M. Semi-supervised learning for imbalanced sentiment classification. In Proceedings of the 22nd International Joint Conference on Artificial Intelligence, Barcelona, Spain, 19–22 July 2011; pp. 1826–1831. [Google Scholar]
- Blum, A.; Mitchell, T. Combining labeled and unlabeled data with co-training. In Proceedings of the 11th Annual Conference on Computational Learning Theory, Madison, WI, USA, 24–26 July 1998; pp. 92–100. [Google Scholar]
- Katz, G.; Caragea, C.; Shabtai, A. Vertical Ensemble Co-Training for Text Classification. ACM Trans. Intell. Syst. Technol. 2018, 9, 21:1–21:23. [Google Scholar] [CrossRef]
- Zhou, Z.H.; Zhan, D.C.; Yang, Q. Semi-supervised learning with very few labeled training examples. In Proceedings of the Twenty-Second AAAI Conference on Artificial Intelligence, Vancouver, BC, Canada, 22–26 July 2007; pp. 675–680. [Google Scholar]
- Zhang, M.L.; Zhou, Z.H. CoTrade: Confident co-training with data editing. IEEE Trans. Syst. Man Cybern. Part B Cybern. 2011, 41, 1612–1626. [Google Scholar] [CrossRef] [PubMed]
- Goldman, S.; Zhou, Y. Enhancing supervised learning with unlabeled data. In Proceedings of the 17th International Conference on Machine Learning (ICML 2000), Stanford, CA, USA, 29 June–2 July 2000; pp. 327–334. [Google Scholar]
- Weston, J.; Ratle, F.; Mobahi, H.; Collobert, R. Deep learning via semi-supervised embedding. In Neural Networks: Tricks of the Trade, 2nd ed.; Grégoire, M., Genevieve, B.O., Klaus-Robert, M., Eds.; Springer: Berlin/Heidelberg, Germany, 2012; pp. 639–655. [Google Scholar]
- Sachan, D.S.; Zaheer, M.; Salakhutdinov, R. Revisiting LSTM Networks for Semi-Supervised Text Classification via Mixed Objective Function. In Proceedings of the AAAI Conference on Artificial Intelligence, Honolulu, HI, USA, 27 January–1 February 2019; pp. 6940–6948. [Google Scholar]
- Miyato, T.; Maeda, S.I.; Koyama, M.; Ishii, S. Virtual adversarial training: A regularization method for supervised and semi-supervised learning. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 41, 1979–1993. [Google Scholar] [CrossRef] [PubMed] [Green Version]
- Mountassir, A.; Benbrahim, H.; Berrada, I. An empirical study to address the problem of unbalanced data sets in sentiment classification. In Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics, SMC 2012, Seoul, Korea, 14–17 October 2012; pp. 3298–3303. [Google Scholar]
- Medhat, W.; Hassan, A.; Korashy, H. Sentiment analysis algorithms and applications: A survey. Ain Shams Eng. J. 2014, 5, 1093–1113. [Google Scholar] [CrossRef] [Green Version]
- Saif, H.; He, Y.; Fernandez, M.; Alani, H. Contextual semantics for sentiment analysis of Twitter. Inf. Process. Manag. 2016, 52, 5–19. [Google Scholar] [CrossRef] [Green Version]
- Kaewyong, P.; Sukprasert, A.; Salim, N.; Phang, F.A. The possibility of students’ comments automatic interpret using lexicon based sentiment analysis to teacher evaluation. In Proceedings of the 3rd International Conference on Artificial Intelligence and Computer Science (AICS2015), Penang, Malaysia, 12–13 October 2015; pp. 179–189. [Google Scholar]
- Wen, M.; Yang, D.; Rose, C. Sentiment Analysis in MOOC Discussion Forums: What does it tell us? In Proceedings of the 7th International Conference on Educational Data Mining, EDM 2014, London, UK, 4–7 July 2014; pp. 130–137. [Google Scholar]
- Neethu, M.S.; Rajasree, R. Sentiment analysis in twitter using machine learning techniques. In Proceedings of the 4th International Conference on Computing, Communications and Networking Technologies, Tiruchengode, India, 4–6 July 2013; pp. 1–5. [Google Scholar]
- Gamallo, P.; Garcia, M. Citius: A naive-bayes strategy for sentiment analysis on english tweets. In Proceedings of the 8th International Workshop on Semantic Evaluation, SemEval@COLING 2014, Dublin, Ireland, 23–24 August 2014; pp. 171–175. [Google Scholar]
- Nayak, A.; Natarajan, D. Comparative Study of Naive Bayes, Support Vector Machine and Random Forest Classifiers in Sentiment Analysis of Twitter Feeds. Int. J. Adv. Stud. Comput. Sci. Eng. 2016, 5, 14–17. [Google Scholar]
- Chen, M.; Weinberger, K.Q. An alternative text representation to TF-IDF and Bag-of-Words. arXiv 2013, arXiv:1301.6770. [Google Scholar]
- Nguyen, P.H.G.; Vo, C.T.N. A CNN Model with Data Imbalance Handling for Course-Level Student Prediction Based on Forum Texts. In Proceedings of the 10th International Conference on Computational Collective Intelligence, ICCCI 2018, Bristol, UK, 5–7 September 2018; pp. 479–490. [Google Scholar]
- Lee, K.; Qadir, A.; Hasan, S.A.; Datla, V.; Prakash, A.; Liu, J.; Farri, O. Adverse drug event detection in tweets with semi-supervised convolutional neural networks. In Proceedings of the 26th International Conference on World Wide Web, WWW 2017, Perth, Australia, 3–7 April 2017; pp. 705–714. [Google Scholar]
- Zhou, S.; Chen, Q.; Wang, X. Active deep networks for semi-supervised sentiment classification. In Proceedings of the 23rd International Conference on Computational Linguistics, Beijing, China, 23–27 August 2010; pp. 1515–1523. [Google Scholar]
- Socher, R.; Pennington, J.; Huang, E.H.; Ng, A.Y.; Manning, C.D. Semi-supervised recursive autoencoders for predicting sentiment distributions. In Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, EMNLP 2011, Edinburgh, UK, 27–31 July 2011; pp. 151–161. [Google Scholar]
- Caliskan, A.; Bryson, J.J.; Narayanan, A. Semantics derived automatically from language corpora contain human-like biases. Science 2017, 356, 183–186. [Google Scholar] [CrossRef] [Green Version]
- Kim, Y. Convolutional neural networks for sentence classification. arXiv 2014, arXiv:1408.5882. [Google Scholar]
- Santos, C.D.; Zadrozny, B. Learning character-level representations for part-of-speech tagging. In Proceedings of the 31st International Conference on Machine Learning, ICML 2014, Beijing, China, 21–26 June 2014; pp. 1818–1826. [Google Scholar]
- Kim, Y.; Jernite, Y.; Sontag, D.; Rush, A.M. Character-aware neural language models. In Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, Phoenix, AZ, USA, 12–17 February 2016; pp. 2741–2749. [Google Scholar]
- Peters, M.E.; Neumann, M.; Iyyer, M.; Gardner, M.; Clark, C.; Lee, K.; Zettlemoyer, L. Deep contextualized word representations. arXiv 2018, arXiv:1802.05365. [Google Scholar]
- Xia, R.; Wang, C.; Dai, X.Y.; Li, T. Co-training for semi-supervised sentiment classification based on dual-view bags-of-words representation. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, ACL 2015, Beijing, China, 26–31 July 2015; pp. 1054–1063. [Google Scholar]
- Zhou, Z.H. When semi-supervised learning meets ensemble learning. In Proceedings of the 8th Conference on Multiple Classifier Systems Workshops, MCS 2009, Reykjavik, Iceland, 10–12 June 2009; pp. 529–538. [Google Scholar]
- Hecking, T.; Hoppe, H.U.; Harrer, A. Uncovering the structure of knowledge exchange in a MOOC discussion forum. In Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2015, Paris, France, 25–28 August 2015; pp. 1614–1615. [Google Scholar]
- Lin, T.Y.; Goyal, P.; Girshick, R.; He, K.; Dollár, P. Focal loss for dense object detection. In Proceedings of the IEEE International Conference on Computer Vision, ICCV 2017, Venice, Italy, 22–29 October 2017; pp. 2980–2988. [Google Scholar]
- Grandvalet, Y.; Bengio, Y. Semi-supervised learning by entropy minimization. In Proceedings of the Advances in Neural Information Processing Systems, NIPS 2004, Vancouver, BC, Canada, 13–18 December 2004; pp. 529–536. [Google Scholar]
- Vu, T.H.; Jain, H.; Bucher, M.; Cord, M.; Pérez, P. Advent: Adversarial entropy minimization for domain adaptation in semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2019, Long Beach, CA, USA, 16–20 June 2019; pp. 2517–2526. [Google Scholar]
- Yin, W.; Kann, K.; Yu, M. Comparative study of CNN and RNN for natural language processing. arXiv 2017, arXiv:1702.01923. [Google Scholar]
- Srivastava, N.; Hinton, G.; Krizhevsky, A.; Sutskever, I.; Salakhutdinov, R. Dropout: A simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 2014, 15, 1929–1958. [Google Scholar]
- Webb, G.I.; Zheng, Z. Multistrategy ensemble learning: Reducing error by combining ensemble learning techniques. IEEE Trans. Knowl. Data Eng. 2004, 16, 980–991. [Google Scholar] [CrossRef] [Green Version]
- Mikolov, T.; Chen, K.; Corrado, G.; Dean, J. Efficient Estimation of Word Representations in Vector Space. arXiv 2013, arXiv:1301.3781. [Google Scholar]
- The Stanford MOOCPosts Data Set. Available online: https://datastage.stanford.edu/StanfordMoocPosts/ (accessed on 7 May 2019).
- Bakharia, A. Towards cross-domain MOOC forum post classification. In Proceedings of the 3rd ACM Conference on Learning @ Scale, L@S 2016, Edinburgh, Scotland, UK, 25–26 April 2016. [Google Scholar]
- Sajjadi, M.; Javanmardi, M.; Tasdizen, T. Mutual exclusivity loss for semi-supervised deep learning. In Proceedings of the 2016 IEEE International Conference on Image Processing, ICIP 2016, Phoenix, AZ, USA, 25–28 September 2016; pp. 1908–1912. [Google Scholar]
- Santos, I.; Nedjah, N.; de Macedo Mourelle, L. Sentiment analysis using convolutional neural network with fastText embeddings. In Proceedings of the IEEE Latin American Conference on Computational Intelligence, LA-CCI 2017, Arequipa, Peru, 8–10 November 2017; pp. 1–5. [Google Scholar]
- Kiela, D.; Wang, C.; Cho, K. Dynamic meta-embeddings for improved sentence representations. arXiv 2018, arXiv:1804.07983. [Google Scholar]
- Fang, B.; Li, Y.; Zhang, H.; Chan, J.C.W. Hyperspectral Images Classification Based on Dense Convolutional Networks with Spectral-Wise Attention Mechanism. Remote Sens. 2019, 11, 159. [Google Scholar] [CrossRef] [Green Version]
- Lin, L.; Wang, K.; Meng, D.; Zuo, W.; Zhang, L. Active self-paced learning for cost-effective and progressive face identification. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 40, 7–19. [Google Scholar] [CrossRef] [Green Version]
- Jiang, L.; Meng, D.; Yu, S.I.; Lan, Z.; Shan, S.; Hauptmann, A. Self-paced learning with diversity. In Proceedings of the Advances in Neural Information Processing Systems, Montreal, QC, Canada, 8–13 December 2014; pp. 2078–2086. [Google Scholar]
- Zhang, H.; Cisse, M.; Dauphin, Y.N.; Lopez-Paz, D. mixup: Beyond empirical risk minimization. arXiv 2017, arXiv:1710.09412. [Google Scholar]
| Post | Score |
|---|---|
| I can’t think of a better way to end my holidays than to take this course. I feel my synapses sparkle and feel so inspired. I can’t wait to meet my classes again. | 7 |
| I agree. This will not be automatic for kids until they’ve been shown how. | 4 |
| Terrible interface design! Just put an obvious ’next’ button at the bottom of the main body area or clone the whole linear navigation from the top. | 1 |
| Course | Size | Positive | Negative |
|---|---|---|---|
| Education: How to Learn Math (Course1) | 9878 | 8188 | 1690 |
| Humanities/Science: Stat Learning (Course2) | 3030 | 2834 | 196 |
| Medicine: Sci Write (Course3) | 5184 | 3907 | 1277 |
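The class imbalance visible above (e.g., 2834 positive vs. 196 negative posts in Course2) is what the paper's focal-loss component addresses. As a minimal sketch of the standard binary focal loss (Lin et al.), with illustrative parameter values rather than the paper's tuned settings:

```python
import numpy as np

def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Binary focal loss, per sample.

    p: predicted probability of the positive class; y: 0/1 label.
    The (1 - p_t)**gamma factor down-weights easy, confidently
    correct examples so the rare class contributes relatively more.
    alpha/gamma here are the common defaults, not the paper's values.
    """
    p = np.clip(p, 1e-7, 1 - 1e-7)            # numerical stability
    p_t = np.where(y == 1, p, 1 - p)          # probability of the true class
    alpha_t = np.where(y == 1, alpha, 1 - alpha)
    return -alpha_t * (1 - p_t) ** gamma * np.log(p_t)

# An easy example (p=0.95, true label 1) incurs a much smaller loss
# than a hard one (p=0.30, true label 1):
easy = focal_loss(np.array([0.95]), np.array([1]))
hard = focal_loss(np.array([0.30]), np.array([1]))
```

With `gamma=0` and `alpha=1` this reduces to ordinary cross-entropy, which is one way to sanity-check an implementation.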
| Method | Course1 Accuracy | Course1 F1-score | Course2 Accuracy | Course2 F1-score | Course3 Accuracy | Course3 F1-score |
|---|---|---|---|---|---|---|
| *Traditional supervision* | | | | | | |
| Random Forest | 83.89 | 89.04 | 92.99 | 94.62 | 79.10 | 81.87 |
| SVM (RBF) | 84.92 | 90.57 | 93.12 | 95.27 | 81.23 | 82.35 |
| *Deep supervision* | | | | | | |
| GN-CNN | 85.99 | 91.80 | 93.50 | 96.61 | 82.12 | 84.43 |
| ELMo-CNN | 87.29 | 92.41 | 94.16 | 96.95 | 83.46 | 85.57 |
| *Deep supervision with FL* | | | | | | |
| GN-CNN-FL | 86.36 | 91.97 | 93.83 | 96.79 | 83.57 | 85.41 |
| ELMo-CNN-FL | 87.58 | 92.69 | 94.05 | 96.90 | 84.30 | 86.52 |
| *Ours* | | | | | | |
| SSDL | 89.47 | 94.17 | 94.99 | 97.42 | 84.73 | 89.06 |
| Iteration | Train Number | Test Number | Same Number | Augment Number | |
|---|---|---|---|---|---|
| 1 | 987 | 8891 | 5756 | 576 | 0.4509 |
| 2 | 1563 | 8315 | 7317 | 1463 | 0.3970 |
| 3 | 3028 | 6852 | 6372 | 1911 | 0.3624 |
| 4 | 4939 | 4941 | 4348 | 1739 | 0.3454 |
| 5 | 6678 | 3202 | 2689 | 1345 | 0.3318 |
| 6 | 8023 | 1857 | 1819 | 1091 | 0.2992 |
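The iteration table above traces how the double-check strategy grows the labeled set: each round, the two view-specific classifiers label the unlabeled pool, samples on which both agree ("Same Number") become candidates, and a further confidence cut selects the subset actually added ("Augment Number"). A minimal sketch of that selection step, with an illustrative agreement-plus-confidence rule rather than the paper's exact configuration:

```python
import numpy as np

def double_check_select(proba_a, proba_b, threshold=0.8):
    """Pick pseudo-labeled samples the two views double-check.

    proba_a, proba_b: (n, n_classes) class-probability matrices from
    the two view classifiers over the unlabeled pool. Returns the
    indices and labels of samples whose predicted labels agree
    (check 1) and whose mean confidence clears `threshold` (check 2).
    The 0.8 threshold is a hypothetical choice for illustration.
    """
    pred_a = proba_a.argmax(axis=1)
    pred_b = proba_b.argmax(axis=1)
    agree = pred_a == pred_b                           # check 1: same label
    conf = (proba_a.max(axis=1) + proba_b.max(axis=1)) / 2
    keep = agree & (conf >= threshold)                 # check 2: confident
    idx = np.where(keep)[0]
    return idx, pred_a[idx]

# Toy pool of four unlabeled samples, two classes:
pa = np.array([[0.9, 0.1], [0.6, 0.4], [0.2, 0.8], [0.55, 0.45]])
pb = np.array([[0.95, 0.05], [0.4, 0.6], [0.1, 0.9], [0.60, 0.40]])
idx, labels = double_check_select(pa, pb, threshold=0.8)
# Sample 0 (both confident in class 0) and sample 2 (both in class 1)
# survive; sample 1 fails the agreement check, sample 3 the confidence cut.
```

In each co-training round, the selected samples would be moved from the unlabeled pool into both views' training sets, which is the growth the "Train Number" column records.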
© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Share and Cite
Chen, J.; Feng, J.; Sun, X.; Liu, Y. Co-Training Semi-Supervised Deep Learning for Sentiment Classification of MOOC Forum Posts. Symmetry 2020, 12, 8. https://doi.org/10.3390/sym12010008