“Opinion Target: Opinion Word” Pairs Extraction Based on CRF
Abstract
1. Introduction
2. Related Work
2.1. Overview of the Methods of Opinion Targets Extraction
2.2. Conditional Random Field (CRF)
2.2.1. Principle
2.2.2. Research Situation
3. Methods
3.1. Data Cleaning
3.1.1. Word-Splits and Part of Speech
3.1.2. Mark Emotional Words
3.1.3. The Label Set
3.2. Feature Selection and Template Definition
3.2.1. Feature Selection
3.2.2. Template Definition
3.3. Train the Model of Opinion Targets Extraction
4. Experiments Results and Analysis
4.1. Dataset and Metrics
4.2. Analysis of Results
5. “Opinion Target: Opinion Word” Pair System Development
5.1. System Development Process
5.2. Front-End Design
5.3. Back-End Design
6. Conclusions
Author Contributions
Funding
Conflicts of Interest
References
- Li, S.; Zhou, L.; Li, Y. Improving aspect extraction by augmenting a frequency-based method with web-based similarity measures. Inf. Process. Manag. 2015, 51, 58–67.
- Ahmad Rana, T.; Cheah, Y.-N. A two-fold rule-based model for aspect extraction. Expert Syst. Appl. 2017, 89, 273–285.
- Wang, J.; Peng, Y.; Lin, Y.; Wang, K. Template Based Industrial Big Data Information Extraction and Query System. In Proceedings of the Second International Conference on Data Mining and Big Data (DMBD); Springer: Cham, Switzerland, 2017; pp. 247–254.
- Liu, Q.; Gao, Z.; Liu, B.; Zhang, Y. Automated Rule Selection for Aspect Extraction in Opinion Mining. In Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, Buenos Aires, Argentina, 25–31 July 2015; pp. 1291–1297.
- Jochim, C.; Deleris, L.A. Named Entity Recognition in the Medical Domain with Constrained CRF Models. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers, Valencia, Spain, 3–7 April 2017; pp. 839–849.
- Zhou, Y.; Jiang, W.; Song, P.; Su, Y.; Guo, T.; Han, J.; Hu, S. Graph Convolutional Networks for Target-oriented Opinion Words Extraction with Adversarial Training. In Proceedings of the 2020 International Joint Conference on Neural Networks (IJCNN), Glasgow, UK, 19–24 July 2020; pp. 1–7.
- Chutmongkolporn, K.; Manaskasemsak, B.; Rungsawang, A. Graph-based opinion entity ranking in customer reviews. In Proceedings of the 2015 15th International Symposium on Communications and Information Technologies (ISCIT), Nara, Japan, 7–9 October 2015; pp. 161–164.
- Ansari, G.; Saxena, C.; Ahmad, T.; Doja, M.N. Aspect Term Extraction using Graph-based Semi-Supervised Learning. Procedia Comput. Sci. 2020, 167, 2080–2090.
- Samha, A.K.; Li, Y.; Zhang, J. Aspect-Based Opinion Mining from Product Reviews Using Conditional Random Fields. In Proceedings of the 13th Australasian Data Mining Conference, Sydney, Australia, 8–9 August 2015; pp. 119–128.
- Hu, K.; Ou, Z.; Hu, M.; Feng, J. Neural CRF Transducers for Sequence Labeling. In Proceedings of the ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brighton, UK, 12–17 May 2019; pp. 2997–3001.
- Yin, Y.; Wei, F.; Li, D.; Xu, K.; Zhang, M.; Zhou, M. Unsupervised Word and Dependency Path Embeddings for Aspect Term Extraction. arXiv 2016, arXiv:1605.07843.
- Wang, W.; Pan, S.J.; Dahlmeier, D.; Xiao, X. Recursive Neural Conditional Random Fields for Aspect-based Sentiment Analysis. arXiv 2016, arXiv:1603.06679.
- Al-Smadi, M.; Talafha, B.; Al-Ayyoub, M.; Jararweh, Y. Using long short-term memory deep neural networks for aspect-based sentiment analysis of Arabic reviews. Int. J. Mach. Learn. Cybern. 2019, 10, 2163–2175.
- Hu, J.; Zheng, X. Opinion Extraction of Government Microblog Comments via BiLSTM-CRF Model. In Proceedings of the ACM/IEEE Joint Conference on Digital Libraries in 2020, Wuhan, China, 19–23 June 2020; pp. 473–475.
- Contractor, D.; Patra, B.; Singla, P. Constrained BERT BiLSTM CRF for understanding multi-sentence entity-seeking questions. Nat. Lang. Eng. 2020, 27, 65–87.
- Huang, Z.; Xu, W.; Yu, K. Bidirectional LSTM-CRF Models for Sequence Tagging. arXiv 2015, arXiv:1508.01991.
- Zhao, H.; Huang, L.; Zhang, R.; Lu, Q. SpanMlt: A Span-based Multi-Task Learning Framework for Pair-wise Aspect and Opinion Terms Extraction. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Online, 5–10 July 2020; pp. 3239–3248.
- Lafferty, J.; McCallum, A.; Pereira, F.C. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data. In Proceedings of the International Conference on Machine Learning (ICML), Williamstown, MA, USA, 28 June–1 July 2001; pp. 282–289.
- Baksi, R.P.; Upadhyaya, S.J. Decepticon: A Hidden Markov Model Approach to Counter Advanced Persistent Threats. Commun. Comput. Inf. Sci. 2019, 1186, 38–54.
- Alzaidy, R.; Caragea, C.; Giles, C.L. Bi-LSTM-CRF Sequence Labeling for Keyphrase Extraction from Scholarly Documents. In Proceedings of the World Wide Web Conference, San Francisco, CA, USA, 13–17 May 2019; pp. 2551–2557.
- Jakob, N.; Gurevych, I. Extracting Opinion Targets in a Single and Cross-Domain Setting with Conditional Random Fields. In Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, Cambridge, MA, USA, 9–11 October 2010; pp. 1035–1045.
- Tran, T.U.; Hoang, H.T.T.; Huynh, H.X. Aspect Extraction with Bidirectional GRU and CRF. In Proceedings of the 2019 IEEE-RIVF International Conference on Computing and Communication Technologies (RIVF), Danang, Vietnam, 20–22 March 2019; pp. 1–5.
- Veyseh, A.P.B.; Nouri, N.; Dernoncourt, F.; Dou, D.; Nguyen, T.H. Introducing Syntactic Structures into Target Opinion Word Extraction with Deep Learning. arXiv 2020, arXiv:2010.13378.
- Wang, W.; Pan, S.J.; Dahlmeier, D.; Xiao, X. Coupled Multi-Layer Attentions for Co-Extraction of Aspect and Opinion Terms. In Proceedings of the AAAI Conference on Artificial Intelligence, San Francisco, CA, USA, 4–9 February 2017; pp. 3316–3322.
- Wu, Z.; Zhao, F.; Dai, X.Y.; Huang, S.; Chen, J. Latent Opinions Transfer Network for Target-Oriented Opinion Words Extraction. In Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; pp. 9298–9305.
Reference | Method | Dataset | Precision | Recall | F1 |
---|---|---|---|---|---|
Jakob [21] | CRF | movies | 66.3 | 59.2 | 62.5 |
Jakob [21] | CRF | web-services | 62.4 | 39.4 | 48.3 |
Jakob [21] | CRF | cars | 25.9 | 42.6 | 32.2 |
Jakob [21] | CRF | cameras | 42.3 | 43.1 | 42.6 |
Hu [10] | NCRF | CoNLL-2003 English | - | - | 91.66 |
Hu [10] | NCRF | CoNLL-2002 Dutch | - | - | 81.94 |
Tran [22] | BiGRU+CRF | SemEval-2014 Restaurant | - | - | 85.00 |
Tran [22] | BiGRU+CRF | SemEval-2014 Laptop | - | - | 78.50 |
Wang [12] | RNCRF | SemEval-2014 Restaurant | - | - | 84.52 |
Wang [12] | RNCRF | SemEval-2014 Laptop | - | - | 78.93 |
Veyseh [23] | ONG | SemEval-2014 Restaurant | 83.23 | 81.46 | 82.33 |
Veyseh [23] | ONG | SemEval-2014 Laptop | 73.87 | 77.78 | 75.77 |
Word | Part of Speech | Emotion | Relation | Tag |
---|---|---|---|---|
But | NN | N | OP | O |
the | DT | N | NP | O |
stuff | NN | N | NP | T |
was | VB | N | VP | O |
so | RB | N | OP | O |
horrible | JJ | Y | NP | C |
to | TO | N | OP | O |
us | PRP | N | OP | O |
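The annotated sentence above can be turned into training data and into an extracted pair with very little code. The following is a minimal sketch, assuming that `T` marks an opinion target and `C` an opinion word (read off the example row values, since the label legend is not reproduced here), and that the model is trained on tab-separated columns with a blank line between sentences, the layout used by CRF toolkits such as CRF++. The function names and the nearest-word pairing rule are hypothetical illustrations, not the authors' procedure.

```python
# Rows copied from the table above: (word, part of speech, emotion, relation, tag).
ROWS = [
    ("But", "NN", "N", "OP", "O"),
    ("the", "DT", "N", "NP", "O"),
    ("stuff", "NN", "N", "NP", "T"),
    ("was", "VB", "N", "VP", "O"),
    ("so", "RB", "N", "OP", "O"),
    ("horrible", "JJ", "Y", "NP", "C"),
    ("to", "TO", "N", "OP", "O"),
    ("us", "PRP", "N", "OP", "O"),
]


def to_column_format(sentences):
    """One token per line, tab-separated columns, blank line between sentences."""
    blocks = ["\n".join("\t".join(row) for row in sent) for sent in sentences]
    return "\n\n".join(blocks) + "\n"


def pair_targets_with_opinions(rows):
    """Pair each T-tagged word with the nearest C-tagged word.

    The nearest-word rule is an assumption made for this sketch.
    """
    targets = [(i, row[0]) for i, row in enumerate(rows) if row[4] == "T"]
    opinions = [(i, row[0]) for i, row in enumerate(rows) if row[4] == "C"]
    pairs = []
    for t_index, target in targets:
        if opinions:
            _, word = min(opinions, key=lambda o: abs(o[0] - t_index))
            pairs.append((target, word))
    return pairs


print(to_column_format([ROWS]), end="")
for target, word in pair_targets_with_opinions(ROWS):
    print(f"{target}: {word}")
```

For the example sentence, the only `T` token is "stuff" and the only `C` token is "horrible", so the sketch yields a single "stuff: horrible" pair.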
Name | Template |
---|---|
Template1 | w(−1)/w(0)/w(1)/w(−1,0)/w(0,1)/w(−1,0,1) |
p(−1)/p(0)/p(1)/p(−1,0)/p(0,1)/p(−1,0,1) | |
e(0)p(0) | |
Template2 | w(−1)/w(0)/w(1)/w(−1,0)/w(0,1)/w(−1,0,1) |
p(−1)/p(0)/p(1)/p(−1,0)/p(0,1)/p(−1,0,1) | |
r(−1)/r(0)/r(1)/r(−1,0)/r(0,1)/r(−1,0,1) | |
e(0)p(0) | |
Template3 | w(−1)/w(0)/w(1)/w(−2)/w(2)/w(1,0)/w(0,1)/ |
w(−1,0,1)/w(−2,−1,0)w(0,1,2)/w(−2,−1,0,1,2) | |
p(−1)/p(0)/p(1)/p(−2)/p(2)/p(1,0)/ph(0,1)/p(−1,0,1)/ | |
p(−2,−1,0)p(0,1,2)/p(−2,−1,0,1,2) | |
e(0)p(0) | |
Template4 | w(−1)/w(0)/w(1)/w(−2)/w(2)/w(1,0)/w(0,1)/ |
w(−1,0,1)/w(−2,−1,0)w(0,1,2)/w(−2,−1,0,1,2) | |
p(−1)/p(0)/p(1)/p(−2)/p(2)/p(1,0)/p(0,1)/p(−1,0,1)/ | |
p(−2,−1,0)p(0,1,2)/p(−2,−1,0,1,2) | |
r(−1)/r(0)/r(1)/r(−2)/r(2)/r(1,0)/r(0,1)/r(−1,0,1)/ | |
r(−2,−1,0)r(0,1,2)/r(−2,−1,0,1,2) | |
e(0)p(0) | |
Template5 | w(−1)/w(0)/w(1)/ w(−1,0) |
p(−2)/p(−1)/p(0)/p(1)/p(2)/p(1,0)/p(0,1)/p(−1,0,1) | |
r(−1)/r(0)/r(1)/r(−2)/r(2)/r(1,0)/r(0,1)/r(−1,0,1)/ | |
r(−2,−1,0)r(0,1,2)/r(−2,−1,0,1,2) | |
e(0)p(0) |
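The shorthand in the table above maps mechanically onto `%x[row,col]` template macros. The sketch below is one way to do that expansion, under the assumption that `w`, `p`, `e` and `r` refer to the word, part-of-speech, emotion and relation columns in that order (columns 0 to 3). The column index for `r` and the helper names `expand` and `build_template` are assumptions introduced for illustration.

```python
import re

# Assumed column layout of the training file: word, part of speech, emotion, relation.
COLUMN = {"w": 0, "p": 1, "e": 2, "r": 3}


def expand(shorthand):
    """Turn 'w(-1,0)' into '%x[-1,0]/%x[0,0]' and 'e(0)p(0)' into '%x[0,2]/%x[0,1]'."""
    # Normalise the typographic minus sign used in the table to an ASCII hyphen.
    text = shorthand.replace("\u2212", "-")
    macros = []
    for letter, offsets in re.findall(r"([wper])\(([^)]*)\)", text):
        for offset in offsets.split(","):
            macros.append(f"%x[{int(offset)},{COLUMN[letter]}]")
    return "/".join(macros)


def build_template(shorthands):
    """Number the expanded features as unigram templates U01, U02, ..."""
    return [f"U{i:02d}:{expand(s)}" for i, s in enumerate(shorthands, start=1)]


# Template1 from the table above, one list item per feature.
TEMPLATE1 = [
    "w(-1)", "w(0)", "w(1)", "w(-1,0)", "w(0,1)", "w(-1,0,1)",
    "p(-1)", "p(0)", "p(1)", "p(-1,0)", "p(0,1)", "p(-1,0,1)",
    "e(0)p(0)",
]

for line in build_template(TEMPLATE1):
    print(line)
```

Template1 expands to thirteen unigram features, from `U01:%x[-1,0]` for the previous word to `U13:%x[0,2]/%x[0,1]` for the joint emotion and part-of-speech feature of the current word.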
Template | Meaning |
---|---|
U01:%x[-1,0] | Previous word |
U02:%x[0,0] | Current word |
U03:%x[1,0] | Next word |
U04:%x[-1,0]/%x[0,0] | Combination of previous and current word |
U05:%x[0,0]/%x[1,0] | Combination of current and next word |
U06:%x[-1,0]/%x[0,0]/%x[1,0] | Combination of previous, current and next word |
U07:%x[-1,1] | Part of speech of previous word |
U08:%x[0,1] | Part of speech of current word |
U09:%x[1,1] | Part of speech of next word |
U10:%x[-1,1]/%x[0,1] | Combination of previous and current word's part of speech |
U11:%x[0,1]/%x[1,1] | Combination of current and next word's part of speech |
U12:%x[-1,1]/%x[0,1]/%x[1,1] | Combination of previous, current and next word's part of speech |
U13:%x[0,2]/%x[0,1] | Combination of the current word's part of speech and whether it is an emotional word |
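Each `%x[row,col]` macro selects one cell of the training data: `row` is relative to the current token and `col` is an absolute column index. The following sketch illustrates how such a template line could be expanded into a feature string for one token. It is not the CRF toolkit's own code; the `_B-1` / `_B+1` boundary placeholders are modeled on the CRF++ convention and should be treated as an assumption, as should the function name `expand_macro`.

```python
import re

# Feature columns of the example sentence: word, part of speech, emotion, relation.
ROWS = [
    ("But", "NN", "N", "OP"),
    ("the", "DT", "N", "NP"),
    ("stuff", "NN", "N", "NP"),
    ("was", "VB", "N", "VP"),
    ("so", "RB", "N", "OP"),
    ("horrible", "JJ", "Y", "NP"),
    ("to", "TO", "N", "OP"),
    ("us", "PRP", "N", "OP"),
]

MACRO = re.compile(r"%x\[(-?\d+),(\d+)\]")


def expand_macro(template, rows, pos):
    """Replace every %x[row,col] with the cell at relative row and absolute column."""
    def cell(match):
        index = pos + int(match.group(1))
        column = int(match.group(2))
        if index < 0:
            return f"_B{index}"  # before the first token, e.g. _B-1
        if index >= len(rows):
            return f"_B+{index - len(rows) + 1}"  # past the last token
        return rows[index][column]
    return MACRO.sub(cell, template)


print(expand_macro("U04:%x[-1,0]/%x[0,0]", ROWS, 2))  # token "stuff"
print(expand_macro("U13:%x[0,2]/%x[0,1]", ROWS, 5))   # token "horrible"
print(expand_macro("U01:%x[-1,0]", ROWS, 0))          # first token, no previous word
```

At the token "stuff", U04 combines the previous and current word; at "horrible", U13 combines the emotion flag with the part of speech, which is the feature that ties emotional adjectives to the opinion-word tag.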
Dataset | Number of Sentences | Training Set Size | Test Set Size |
---|---|---|---|
Restaurant | 2000 | 1655 | 345 |
Laptop | 1452 | 1093 | 359 |
Method | Restaurant A | Restaurant P | Restaurant R | Restaurant F1 | Laptop A | Laptop P | Laptop R | Laptop F1 |
---|---|---|---|---|---|---|---|---|
RNCRF | - | - | - | 84.52 | - | - | - | 78.93 |
CMLS | - | - | - | 84.19 | - | - | - | 78.99 |
SpanMlt | - | - | - | 85.70 | - | - | - | 82.56 |
LOTN | - | 84.00 | 82.52 | 82.21 | - | 77.08 | 67.62 | 72.02 |
Template1 (ours) | 96.92 | 90.02 | 89.16 | 89.58 | 96.45 | 87.39 | 81.10 | 84.12 |
Template2 (ours) | 96.87 | 90.19 | 88.52 | 89.36 | 96.40 | 87.02 | 80.98 | 83.89 |
Template3 (ours) | 96.39 | 88.17 | 87.41 | 87.79 | 96.17 | 86.17 | 79.76 | 82.84 |
Template4 (ours) | 96.44 | 88.50 | 87.41 | 87.95 | 96.21 | 86.41 | 79.88 | 83.01 |
Template5 (ours) | 97.14 | 90.67 | 90.00 | 90.33 | 96.95 | 89.32 | 83.66 | 86.40 |
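The P, R and F1 columns above are linked by the standard definition of F1 as the harmonic mean of precision and recall. The sketch below recomputes F1 from the reported P and R for a few rows, assuming the table uses that standard definition. Small differences in the last digit are expected because the reported P and R are themselves rounded. The function name `f1_score` and the `REPORTED` dictionary are introduced here for illustration.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both given in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the results table above.
REPORTED = {
    ("Template1", "Restaurant"): (90.02, 89.16, 89.58),
    ("Template1", "Laptop"): (87.39, 81.10, 84.12),
    ("Template5", "Restaurant"): (90.67, 90.00, 90.33),
    ("Template5", "Laptop"): (89.32, 83.66, 86.40),
}

for (name, dataset), (p, r, f1) in REPORTED.items():
    recomputed = f1_score(p, r)
    print(f"{name} {dataset}: recomputed F1 = {recomputed:.2f}, reported = {f1:.2f}")
```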
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Yan, Y.; Zhou, F.; Ge, Y.; Liu, C.; Feng, J. “Opinion Target: Opinion Word” Pairs Extraction Based on CRF. Symmetry 2021, 13, 251. https://doi.org/10.3390/sym13020251