Classification of Arabic Tweets: A Review
Abstract
1. Introduction
2. Comparison with Other Surveys
3. Background Knowledge
3.1. Arabic Language
3.2. Arabic Dialect
- Has simpler grammar and a more informal language and style
- Has several letters whose articulation may vary by dialect
- Has terms or phrases that differ between dialects
- Is used in writing only when an intimate or humorous touch is needed
3.2.1. Sudanese Arabic
3.2.2. Egyptian Arabic
3.2.3. Maghrebi Arabic
3.2.4. Gulf
3.2.5. Levantine
3.2.6. Yemeni Arabic
3.2.7. Mesopotamian
3.3. Text Classification
3.4. Data Gathering
3.5. Arabic Corpora
3.6. Exploring/Preprocessing Data
3.7. Train and Evaluate Model
3.8. Deployment of Model
3.9. Machine Learning Algorithms
3.9.1. Supervised Learning
3.9.2. Unsupervised Learning
3.9.3. Semi-Supervised Learning
4. Machine Learning Techniques for Arabic Tweet Classification
4.1. Supervised Learning Techniques
Discussion and Learned Lessons
4.2. Unsupervised Machine Learning Techniques
Discussion and Learned Lessons
4.3. Hybrid Machine Learning Techniques
Discussion and Learned Lessons
5. Lexicon-Based Text Classification
Discussion and Learned Lessons
6. Challenges of Arabic Text Classification
6.1. Small Number of Comprehensive Data Sets
6.2. Sarcasm in Text
6.3. Compound Phrases and Idioms
6.4. Arabizi
6.5. Repetition of Words
6.6. Negations
6.7. Complex Morphology
7. Deep Learning for Arabic Sentiment Analysis
8. Transformer for Arabic Text
9. Future Research Directions
- New hybrid approaches can be developed using deep learning. Big data technologies such as MapReduce and Hadoop may help address some of the current problems in Arabic sentiment analysis.
- The sentiment analysis approaches highlighted in this survey should be studied further to identify the most effective Arabic Sentiment Analysis (ASA) method.
- Most techniques rely on manually assembled resources; new systems are needed that create such resources automatically.
- There are several Arabic dialects, and most of them are processed individually. Methods and techniques are needed that can process all dialects.
- In most studies, researchers construct their resources from scratch. Ways are needed to reuse existing resources in this construction.
- Deep learning approaches are promising in many fields, such as health care, agriculture, and image processing. Little work has applied these techniques to Arabic text, so finding appropriate deep learning methods for Arabic text processing is a promising research area.
- For English, many applications operate on the NLP paradigm; in contrast, Arabic text processing has received far less attention. To fill this gap, the research community should target such applications for Arabic text.
- Due to the large-scale use of the internet and social media, a new form of Arabic text has evolved, known as Arabizi (Arabic dialect speech written in Latin characters). Arabizi is widely used in tweets, so work is needed on the detection and analysis of such tweets.
- Enterprise tools and software should be developed for Arabic text to help improve product sales by analyzing user comments and reviews.
- Larger dictionaries and data sets should be considered for Arabic text analysis.
- A large corpus containing multi-dialect Arabic data can be built and used to evaluate techniques.
- Hybrid models can be used for the detection of negation in text for more reliable results.
- More research should be carried out on semantic analysis, because the same word may have different meanings in different contexts.
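To make the Arabizi direction above more concrete, the following is a minimal sketch of how a tweet might be flagged as a candidate Arabizi text before further analysis. It is not a method from the surveyed literature. The function names `script_profile` and `looks_like_arabizi` are hypothetical, and the heuristic rests on two assumptions: Arabizi is written in Latin letters, and it often uses digits such as 3 and 7 inside words in place of Arabic letters. URLs, mentions, and hashtags are assumed to have been stripped beforehand.

```python
import re

# Arabic letters in the basic Unicode Arabic block (U+0621 to U+064A).
ARABIC_LETTER = re.compile(r"[\u0621-\u064A]")
# Basic Latin letters.
LATIN_LETTER = re.compile(r"[A-Za-z]")
# A word that contains at least one Latin letter and at least one digit
# commonly used in Arabizi as a letter substitute (assumed set: 2 3 5 6 7 9).
ARABIZI_DIGIT_WORD = re.compile(r"\b(?=\w*[A-Za-z])(?=\w*[235679])\w+\b")


def script_profile(text: str) -> str:
    """Return 'arabic', 'latin', 'mixed', or 'none' by counting letters."""
    arabic = len(ARABIC_LETTER.findall(text))
    latin = len(LATIN_LETTER.findall(text))
    if arabic == 0 and latin == 0:
        return "none"
    if latin == 0:
        return "arabic"
    if arabic == 0:
        return "latin"
    return "mixed"


def looks_like_arabizi(text: str) -> bool:
    """Heuristic flag: Latin-only text with a word mixing letters and Arabizi digits."""
    if script_profile(text) != "latin":
        return False
    return ARABIZI_DIGIT_WORD.search(text) is not None
```

A rule of this kind can only serve as a cheap pre-filter. It misses Arabizi written without digits and will also flag ordinary tokens that mix letters and digits, so a trained classifier would still be needed for reliable detection.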
10. Conclusions
Funding
Conflicts of Interest
References
- Greenwood, S.; Perrin, A.; Duggan, M. Social media update 2016. Pew Res. Cent. 2016, 11, 1–18. [Google Scholar]
- Asur, S.; Huberman, B.A. Predicting the future with social media. In Proceedings of the 2010 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology, Toronto, ON, Canada, 31 August–3 September 2010; Volume 1, pp. 492–499. [Google Scholar]
- Fuchs, C. Social Media: A Critical Introduction; Sage: Thousand Oaks, CA, USA, 2017. [Google Scholar]
- Tartir, S.; Abdul-Nabi, I. Semantic sentiment analysis in Arabic social media. J. King Saud-Univ.-Comput. Inf. Sci. 2017, 29, 229–233. [Google Scholar] [CrossRef]
- Hughes, D.J.; Rowe, M.; Batey, M.; Lee, A. A tale of two sites: Twitter vs. Facebook and the personality predictors of social media usage. Comput. Hum. Behav. 2012, 28, 561–569. [Google Scholar] [CrossRef] [Green Version]
- Griffis, H.M.; Kilaru, A.S.; Werner, R.M.; Asch, D.A.; Hershey, J.C.; Hill, S.; Ha, Y.P.; Sellers, A.; Mahoney, K.; Merchant, R.M. Use of social media across US hospitals: Descriptive analysis of adoption and utilization. J. Med. Internet Res. 2014, 16, e264. [Google Scholar] [CrossRef]
- Elnagar, A.; Al-Debsi, R.; Einea, O. Arabic text classification using deep learning models. Inf. Process. Manag. 2020, 57, 102121. [Google Scholar] [CrossRef]
- Abd Al-Aziz, A.M.; Gheith, M.; Eldin, A.S. Lexicon based and multi-criteria decision making (MCDM) approach for detecting emotions from Arabic microblog text. In Proceedings of the 2015 First International Conference on Arabic Computational Linguistics (ACLing), Cairo, Egypt, 17–20 April 2015; pp. 100–105. [Google Scholar]
- Neri, F.; Aliprandi, C.; Capeci, F.; Cuadros, M.; By, T. Sentiment Analysis on Social Media. ASONAM 2012, 12, 919–926. [Google Scholar]
- Yu, Y.; Duan, W.; Cao, Q. The impact of social and conventional media on firm equity value: A sentiment analysis approach. Decis. Support Syst. 2013, 55, 919–926. [Google Scholar] [CrossRef]
- Yue, L.; Chen, W.; Li, X.; Zuo, W.; Yin, M. A survey of sentiment analysis in social media. Knowl. Inf. Syst. 2019, 60, 617–663. [Google Scholar] [CrossRef]
- Al-Radaideh, Q. Applications of Mining Arabic Text: A Review. In Recent Trends in Computational Intelligence; IntechOpen: London, UK, 2020. [Google Scholar]
- Shehab, M.A.; Badarneh, O.; Al-Ayyoub, M.; Jararweh, Y. A supervised approach for multi-label classification of Arabic news articles. In Proceedings of the 2016 7th International Conference on Computer Science and Information Technology (CSIT), Amman, Jordan, 13–16 July 2016; pp. 1–6. [Google Scholar]
- Ahmed, N.A.; Shehab, M.A.; Al-Ayyoub, M.; Hmeidi, I. Scalable multi-label arabic text classification. In Proceedings of the 2015 6th International Conference on Information and Communication Systems (ICICS), Amman, Jordan, 7–9 July 2015; pp. 212–217. [Google Scholar]
- Joulin, A.; Grave, E.; Bojanowski, P.; Mikolov, T. Bag of tricks for efficient text classification. arXiv 2016, arXiv:1607.01759. [Google Scholar]
- Abdullah, M.; Hadzikadic, M. Sentiment analysis on arabic tweets: Challenges to dissecting the language. In Proceedings of the International Conference on Social Computing and Social Media, Vancouver, BC, Canada, 9–14 July 2017; pp. 191–202. [Google Scholar]
- Al-Moslmi, T.; Omar, N.; Abdullah, S.; Albared, M. Approaches to cross-domain sentiment analysis: A systematic literature review. IEEE Access 2017, 5, 16173–16192. [Google Scholar] [CrossRef]
- Almuqren, L.; Alzammam, A.; Alotaibi, S.; Cristea, A.; Alhumoud, S. A review on corpus annotation for Arabic sentiment analysis. In Proceedings of the International Conference on Social Computing and Social Media, Vancouver, BC, Canada, 9–14 July 2017; pp. 215–225. [Google Scholar]
- Alnawas, A.; Arici, N. The corpus based approach to sentiment analysis in modern standard Arabic and Arabic dialects: A literature review. Politek. Derg. 2018, 21, 461–470. [Google Scholar] [CrossRef]
- Alhumoud, S.O.; Altuwaijri, M.I.; Albuhairi, T.M.; Alohaideb, W.M. Survey on arabic sentiment analysis in twitter. Int. Sci. Index 2015, 9, 364–368. [Google Scholar]
- Assiri, A.; Emam, A.; Aldossari, H. Arabic Sentiment Analysis: A Survey. Int. J. Adv. Comput. Sci. Appl. 2015, 6. [Google Scholar] [CrossRef] [Green Version]
- Al-Ayyoub, M.; Nuseir, A.; Alsmearat, K.; Jararweh, Y.; Gupta, B. Deep learning for Arabic NLP: A survey. J. Comput. Sci. 2018, 26, 522–531. [Google Scholar] [CrossRef]
- Guellil, I.; Saâdane, H.; Azouaou, F.; Gueni, B.; Nouvel, D. Arabic natural language processing: An overview. J. King Saud-Univ.-Comput. Inf. Sci. 2019. [Google Scholar] [CrossRef]
- Badaro, G.; Baly, R.; Hajj, H.; El-Hajj, W.; Shaban, K.B.; Habash, N.; Al-Sallab, A.; Hamdi, A. A survey of opinion mining in Arabic: A comprehensive system perspective covering challenges and advances in tools, resources, models, applications, and visualizations. ACM Trans. Asian-Low-Resour. Lang. Inf. Process. (TALLIP) 2019, 18, 1–52. [Google Scholar] [CrossRef] [Green Version]
- Al-Twairesh, N.; Al-Khalifa, H.; Al-Salman, A. Subjectivity and sentiment analysis of Arabic: Trends and challenges. In Proceedings of the 2014 IEEE/ACS 11th International Conference on Computer Systems and Applications (AICCSA), Doha, Qatar, 10–13 November 2014; pp. 148–155. [Google Scholar]
- Kaseb, G.S.; Ahmed, M.F. Arabic sentiment analysis approaches: An analytical survey. Int. J. Sci. Eng. Res. 2016, 7, 712–723. [Google Scholar]
- El-Masri, M.; Altrabsheh, N.; Mansour, H. Successes and challenges of Arabic sentiment analysis research: A literature review. Soc. Netw. Anal. Min. 2017, 7, 54. [Google Scholar] [CrossRef]
- Dalila, B.; Mohamed, A.; Bendjanna, H. A review of recent aspect extraction techniques for opinion mining systems. In Proceedings of the 2018 2nd International Conference on Natural Language and Speech Processing (ICNLSP), Algiers, Algeria, 25–26 April 2018; pp. 1–6. [Google Scholar]
- Hamdi, A.; Shaban, K.; Zainal, A. A Review on Challenging Issues in Arabic Sentiment Analysis. J. Comput. Sci. 2016. [Google Scholar] [CrossRef] [Green Version]
- Ghallab, A.; Mohsen, A.; Ali, Y. Arabic Sentiment Analysis: A Systematic Literature Review. Appl. Comput. Intell. Soft Comput. 2020, 2020. [Google Scholar] [CrossRef] [Green Version]
- Abo, M.E.M.; Raj, R.G.; Qazi, A. A Review on Arabic Sentiment Analysis: State-of-the-Art, Taxonomy and Open Research Challenges. IEEE Access 2019, 7, 162008–162024. [Google Scholar] [CrossRef]
- Alsayat, A.; Elmitwally, N. A comprehensive study for Arabic Sentiment Analysis (Challenges and Applications). Egypt. Inform. J. 2020, 21, 7–12. [Google Scholar] [CrossRef]
- Abdul-Mageed, M.; Alhuzali, H.; Elaraby, M. You tweet what you speak: A city-level dataset of arabic dialects. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Miyazaki, Japan, 7–12 May 2018. [Google Scholar]
- Harrat, S.; Meftouh, K.; Smaili, K. Machine translation for Arabic dialects (survey). Inf. Process. Manag. 2019, 56, 262–273. [Google Scholar] [CrossRef] [Green Version]
- Alkhair, M.; Meftouh, K.; Smaïli, K.; Othman, N. An arabic corpus of fake news: Collection, analysis and classification. In Proceedings of the International Conference on Arabic Language Processing, Nancy, France, 16–17 October 2019; pp. 292–302. [Google Scholar]
- Zeroual, I.; Lakhouaja, A. Arabic corpus linguistics: Major progress, but still a long way to go. In Intelligent Natural Language Processing: Trends and Applications; Springer: Berlin/Heidelberg, Germany, 2018; pp. 613–636. [Google Scholar]
- Aggarwal, C.C.; Zhai, C. A survey of text classification algorithms. In Mining Text Data; Springer: Berlin/Heidelberg, Germany, 2012; pp. 163–222. [Google Scholar]
- Ikonomakis, M.; Kotsiantis, S.; Tampakas, V. Text classification using machine learning techniques. WSEAS Trans. Comput. 2005, 4, 966–974. [Google Scholar]
- Kowsari, K.; Jafari Meimandi, K.; Heidarysafa, M.; Mendu, S.; Barnes, L.; Brown, D. Text classification algorithms: A survey. Information 2019, 10, 150. [Google Scholar] [CrossRef] [Green Version]
- Boukil, S.; Biniz, M.; El Adnani, F.; Cherrat, L.; El Moutaouakkil, A.E. Arabic text classification using deep learning technics. Int. J. Grid Distrib. Comput. 2018, 11, 103–114. [Google Scholar] [CrossRef]
- Castillo, C.; Mendoza, M.; Poblete, B. Information Credibility on Twitter. In Proceedings of the 20th International Conference on World Wide Web, WWW ’11, Hyderabad, India, 28 March–1 April 2011; Association for Computing Machinery: New York, NY, USA, 2011; pp. 675–684. [Google Scholar] [CrossRef]
- Habash, N.; Sadat, F. Arabic preprocessing schemes for statistical machine translation. In Proceedings of the Human Language Technology Conference of the NAACL, Companion Volume: Short Papers; Association for Computational Linguistics: New York, NY, USA, 2006; pp. 49–52. [Google Scholar]
- Dukes, K.; Habash, N. Morphological Annotation of Quranic Arabic. In Proceedings of the International Conference on Language Resources and Evaluation, LREC 2010, Valletta, Malta, 17–23 May 2010. [Google Scholar]
- Traboulsi, H. Arabic named entity extraction: A local grammar-based approach. In Proceedings of the 2009 International Multiconference on Computer Science and Information Technology, Mragowo, Poland, 12–14 October 2009; pp. 139–143. [Google Scholar]
- McNeil, K. Tunisian arabic corpus: Creating a written corpus of an ‘unwritten’language. In Arabic Corpus Linguistics; Edinburgh University Press: Edinburgh, UK, 2018; Volume 30. [Google Scholar]
- Alansary, S.; Nagi, M.; Adly, N. Building an International Corpus of Arabic (ICA): Progress of compilation stage. In Proceedings of the 7th International Conference on Language Engineering, Cairo, Egypt, 5–6 December 2007; pp. 5–6. [Google Scholar]
- Ahmed, M.A.; Hasan, R.A.; Ali, A.H.; Mohammed, M.A. The classification of the modern arabic poetry using machine learning. Telkomnika 2019, 17, 2667–2674. [Google Scholar] [CrossRef] [Green Version]
- Elhassan, R.; Ahmed, M. Arabic text classification on full word. Int. J. Comput. Sci. Softw. Eng. (IJCSSE) 2015, 4, 114–120. [Google Scholar]
- Baier, L.; Jöhren, F.; Seebacher, S. Challenges in the deployment and operation of machine learning in practice. In Proceedings of the 27th European Conference on Information Systems (ECIS), Stockholm & Uppsala, Sweden, 8–14 June 2019. [Google Scholar]
- Baltrušaitis, T.; Ahuja, C.; Morency, L.P. Multimodal machine learning: A survey and taxonomy. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 41, 423–443. [Google Scholar] [CrossRef] [Green Version]
- Aggarwal, C.C.; Zhai, C. Mining Text Data; Springer Science & Business Media: New York, NY, USA, 2012. [Google Scholar]
- Zhang, J.; Zhan, Z.H.; Lin, Y.; Chen, N.; Gong, Y.J.; Zhong, J.H.; Chung, H.S.; Li, Y.; Shi, Y.H. Evolutionary computation meets machine learning: A survey. IEEE Comput. Intell. Mag. 2011, 6, 68–75. [Google Scholar] [CrossRef]
- Pan, W.; Zhong, E.; Yang, Q. Transfer learning for text mining. In Mining Text Data; Springer: Berlin/Heidelberg, Germany, 2012; pp. 223–257. [Google Scholar]
- Khan, A.; Baharudin, B.; Lee, L.H.; Khan, K. A review of machine learning algorithms for text-documents classification. J. Adv. Inf. Technol. 2010, 1, 4–20. [Google Scholar]
- Das, K.; Behera, R.N. A survey on machine learning: Concept, algorithms and applications. Int. J. Innov. Res. Comput. Commun. Eng. 2017, 5, 1301–1309. [Google Scholar]
- Baydin, A.G.; Pearlmutter, B.A.; Radul, A.A.; Siskind, J.M. Automatic differentiation in machine learning: A survey. J. Mach. Learn. Res. 2017, 18, 5595–5637. [Google Scholar]
- Wang, P.; Li, Y.; Reddy, C.K. Machine learning for survival analysis: A survey. ACM Comput. Surv. (CSUR) 2019, 51, 1–36. [Google Scholar] [CrossRef]
- Benchettara, N.; Kanawati, R.; Rouveirol, C. Supervised machine learning applied to link prediction in bipartite social networks. In Proceedings of the 2010 International Conference on Advances in Social Networks Analysis and Mining, Odense, Denmark, 9–10 August 2010; pp. 326–330. [Google Scholar]
- Singh, A.; Thakur, N.; Sharma, A. A review of supervised machine learning algorithms. In Proceedings of the 2016 3rd International Conference on Computing for Sustainable Global Development (INDIACom), India, New Delhi, 16–18 March 2016; pp. 1310–1315. [Google Scholar]
- Cheng, M.Y.; Kusoemo, D.; Gosno, R.A. Text mining-based construction site accident classification using hybrid supervised machine learning. Autom. Constr. 2020, 118, 103265. [Google Scholar] [CrossRef]
- Jaeger, S.; Fulle, S.; Turk, S. Mol2vec: Unsupervised machine learning approach with chemical intuition. J. Chem. Inf. Model. 2018, 58, 27–35. [Google Scholar] [CrossRef] [PubMed]
- Janasik, N.; Honkela, T.; Bruun, H. Text mining in qualitative research: Application of an unsupervised learning method. Organ. Res. Methods 2009, 12, 436–460. [Google Scholar] [CrossRef] [Green Version]
- Goseva-Popstojanova, K.; Tyo, J. Identification of security related bug reports via text mining using supervised and unsupervised classification. In Proceedings of the 2018 IEEE International Conference on Software Quality, Reliability and Security (QRS), Lisbon, Portugal, 16–20 July 2018; pp. 344–355. [Google Scholar]
- Huo, H.; Rong, Z.; Kononova, O.; Sun, W.; Botari, T.; He, T.; Tshitoyan, V.; Ceder, G. Semi-supervised machine-learning classification of materials synthesis procedures. NPJ Comput. Mater. 2019, 5, 1–7. [Google Scholar] [CrossRef] [Green Version]
- Wu, C.; Wu, F.; Wu, S.; Yuan, Z.; Liu, J.; Huang, Y. Semi-supervised dimensional sentiment analysis with variational autoencoder. Knowl.-Based Syst. 2019, 165, 30–39. [Google Scholar] [CrossRef]
- Yilmaz, C.M.; Durahim, A.O. SPR2EP: A semi-supervised spam review detection framework. In Proceedings of the 2018 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), Barcelona, Spain, 28–31 August 2018; pp. 306–313. [Google Scholar]
- Li, Y.; Pan, Q.; Wang, S.; Peng, H.; Yang, T.; Cambria, E. Disentangled variational auto-encoder for semi-supervised learning. Inf. Sci. 2019, 482, 73–85. [Google Scholar] [CrossRef] [Green Version]
- Dalal, M.K.; Zaveri, M.A. Automatic text classification: A technical review. Int. J. Comput. Appl. 2011, 28, 37–40. [Google Scholar] [CrossRef]
- Agarwal, B.; Mittal, N. Text classification using machine learning methods-a survey. In Proceedings of the Second International Conference on Soft Computing for Problem Solving (SocProS 2012), Jaipur, India, 28–30 December 2012; pp. 701–709. [Google Scholar]
- Tong, S.; Koller, D. Support vector machine active learning with applications to text classification. J. Mach. Learn. Res. 2001, 2, 45–66. [Google Scholar]
- Duwairi, R.M.; Qarqaz, I. A framework for Arabic sentiment analysis using supervised classification. Int. J. Data Min. Model. Manag. 2016, 8, 369–381. [Google Scholar]
- Atoum, J.O.; Nouman, M. Sentiment analysis of Arabic jordanian dialect tweets. Int. J. Adv. Comput. Sci. Appl. 2019, 10, 256–262. [Google Scholar] [CrossRef] [Green Version]
- Jardaneh, G.; Abdelhaq, H.; Buzz, M.; Johnson, D. Classifying Arabic tweets based on credibility using content and user features. In Proceedings of the 2019 IEEE Jordan International Joint Conference on Electrical Engineering and Information Technology (JEEIT), Jordan, Amman, 9–11 April 2019; pp. 596–601. [Google Scholar]
- Al-Horaibi, L.; Khan, M.B. Sentiment analysis of Arabic tweets using text mining techniques. In Proceedings of the First International Workshop on Pattern Recognition. International Society for Optics and Photonics, Tokyo, Japan, 11–13 May 2016; Volume 10011, p. 100111F. [Google Scholar]
- Abdelaal, H.M.; Elmahdy, A.N.; Halawa, A.A.; Youness, H.A. Improve the automatic classification accuracy for Arabic tweets using ensemble methods. J. Electr. Syst. Inf. Technol. 2018, 5, 363–370. [Google Scholar] [CrossRef]
- Alsanad, A. Arabic Topic Detection Using Discriminative Multi nominal Naïve Bayes and Frequency Transforms. In Proceedings of the 2018 International Conference on Signal Processing and Machine Learning, Shanghai, China, 28–30 November 2018; pp. 17–21. [Google Scholar]
- Duwairi, R.M.; Qarqaz, I. Arabic sentiment analysis using supervised classification. In Proceedings of the 2014 International Conference on Future Internet of Things and Cloud, Barcelona, Spain, 27–29 August 2014; pp. 579–583. [Google Scholar]
- Ismail, R.; Omer, M.; Tabir, M.; Mahadi, N.; Amin, I. Sentiment analysis for arabic dialect using supervised learning. In Proceedings of the 2018 International Conference on Computer, Control, Electrical, and Electronics Engineering (ICCCEEE), Sudan, Khartoum, 12–14 August 2018; pp. 1–6. [Google Scholar]
- Alsaleem, S. Automated Arabic Text Categorization Using SVM and NB. Int. Arab. J. Technol. 2011, 2, 124–128. [Google Scholar]
- Salamah, J.B.; Elkhlifi, A. Microblogging opinion mining approach for kuwaiti dialect. In Proceedings of the International Conference on Computing Technology and Information Management (ICCTIM), Dubai, United Arab Emirates, 9–11 April 2014; p. 388. [Google Scholar]
- Al-Osaimi, S.; Badruddin, K.M. Role of Emotion icons in Sentiment classification of Arabic Tweets. In Proceedings of the 6th International Conference on Management of Emergent Digital Ecosystems, Buraidah Al Qassim, Saudi Arabia, 15-17 September 2014; pp. 167–171. [Google Scholar]
- Abdul-Mageed, M.; Diab, M.; Kübler, S. SAMAR: Subjectivity and sentiment analysis for Arabic social media. Comput. Speech Lang. 2014, 28, 20–37. [Google Scholar] [CrossRef]
- Shoukry, A.; Rafea, A. Sentence-level Arabic sentiment analysis. In Proceedings of the 2012 International Conference on Collaboration Technologies and Systems (CTS), Denver, CO, USA, 21–25 May 2012; pp. 546–550. [Google Scholar] [CrossRef]
- Oussous, A.; Benjelloun, F.Z.; Lahcen, A.A.; Belfkih, S. ASA: A framework for Arabic sentiment analysis. J. Inf. Sci. 2020, 46, 544–559. [Google Scholar] [CrossRef]
- Ombabi, A.H.; Ouarda, W.; Alimi, A.M. Deep learning CNN—LSTM framework for Arabic sentiment analysis using textual information shared in social networks. Soc. Netw. Anal. Min. 2020, 10, 1–13. [Google Scholar] [CrossRef]
- Harrag, F.; El-Qawasmeh, E.; Pichappan, P. Improving Arabic text categorization using decision trees. In Proceedings of the 2009 First International Conference on Networked Digital Technologies, Ostrava, Czech Republic, 29–31 July 2009; pp. 110–115. [Google Scholar]
- Saad, M.K.; Ashour, W.M. Arabic text classification using decision trees. Arab. Text Classif. Using Decis. Trees 2010, 2. [Google Scholar]
- Elawady, R.M.; Barakat, S.; Nora, M.E. Sentiment analyzer for arabic comments. Int. J. Inf. Sci. Intell. Syst. 2014, 3, 73–86. [Google Scholar]
- Hammad, M.; Al-awadi, M. Sentiment analysis for arabic reviews in social networks using machine learning. In Information Technology: New Generations; Springer: Berlin/Heidelberg, Germany, 2016; pp. 131–139. [Google Scholar]
- Abdullah, M.; AlMasawa, M.; Makki, I.; Alsolmi, M.; Mahrous, S. Emotions extraction from Arabic tweets. Int. J. Comput. Appl. 2020, 42, 661–675. [Google Scholar] [CrossRef]
- Helmy, T.; Daud, A. Intelligent agent for information extraction from Arabic text without machine translation. In Proceedings of the 1st International Workshop on Cross-Cultural and Cross-Lingual Aspects of the Semantic Web, Shanghai, China, 7 November 2010; Volume 1, p. C3LSW2010. [Google Scholar]
- Gentleman, R.; Carey, V.J. Unsupervised machine learning. In Bioconductor Case Studies; Springer: Berlin/Heidelberg, Germany, 2008; pp. 137–157. [Google Scholar]
- Al-Azzawy, D.S.; Al-Rufaye, F.M.L. Arabic words clustering by using K-means algorithm. In Proceedings of the 2017 Annual Conference on New Trends in Information & Communications Technology Applications (NTICT), Baghdad, Iraq, 7–9 March 2017; pp. 263–267. [Google Scholar]
- Alzanin, S.M.; Azmi, A.M. Rumor detection in Arabic tweets using semi-supervised and unsupervised expectation–maximization. Knowl.-Based Syst. 2019, 185, 104945. [Google Scholar] [CrossRef]
- Abuaiadah, D. Using bisect k-means clustering technique in the analysis of Arabic documents. ACM Trans. Asian-Low-Resour. Lang. Inf. Process. (TALLIP) 2016, 15, 1–13. [Google Scholar] [CrossRef]
- Mostafa, M.M. Clustering halal food consumers: A Twitter sentiment analysis. Int. J. Mark. Res. 2019, 61, 320–337. [Google Scholar] [CrossRef]
- Sangaiah, A.K.; Fakhry, A.E.; Abdel-Basset, M.; El-henawy, I. Arabic text clustering using improved clustering algorithms with dimensionality reduction. Clust. Comput. 2019, 22, 4535–4549. [Google Scholar] [CrossRef]
- Abuaiadah, D.; Rajendran, D.; Jarrar, M. Clustering Arabic tweets for sentiment analysis. In Proceedings of the 2017 IEEE/ACS 14th International Conference on Computer Systems and Applications (AICCSA), Hammamet, Tunisia, 30 October–3 November 2017; pp. 449–456. [Google Scholar]
- Elarnaoty, M.; AbdelRahman, S.; Fahmy, A. A machine learning approach for opinion holder extraction in Arabic language. arXiv 2012, arXiv:1206.1011. [Google Scholar] [CrossRef]
- Oraby, S.; El-Sonbaty, Y.; Abou El-Nasr, M. Finding opinion strength using rule-based parsing for arabic sentiment analysis. In Proceedings of the Mexican International Conference on Artificial Intelligence, Mexico City, Mexico, 24–30 November 2013; pp. 509–520. [Google Scholar]
- El-Halees, A.M. Arabic opinion mining using combined classification approach. In Arabic Opinion Mining Using Combined Classification Approach; Naif Arab University for Security Sciences: Riyadh, Saudi Arabia, 2011. [Google Scholar]
- Huang, F. Improved Arabic dialect classification with social media data. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, Lisbon, Portugal, 17–21 September 2015; pp. 2118–2126. [Google Scholar]
- Salloum, S.A.; Al-Emran, M.; Abdallah, S.; Shaalan, K. Analyzing the Arab gulf newspapers using text mining techniques. In Proceedings of the International Conference on Advanced Intelligent Systems and Informatics, Cairo, Egypt, 9–11 September 2017; pp. 396–405. [Google Scholar]
- Marie-Sainte, S.L.; Alalyani, N.; Alotaibi, S.; Ghouzali, S.; Abunadi, I. Arabic natural language processing and machine learning-based systems. IEEE Access 2018, 7, 7011–7020. [Google Scholar] [CrossRef]
- Aljarah, I.; Habib, M.; Hijazi, N.; Faris, H.; Qaddoura, R.; Hammo, B.; Abushariah, M.; Alfawareh, M. Intelligent detection of hate speech in Arabic social network: A machine learning approach. J. Inf. Sci. 2020, 0165551520917651. [Google Scholar] [CrossRef]
- Silva, E.F.; Barros, F.A.; Prudencio, R.B. A hybrid machine learning approach for information extraction. In Proceedings of the 2006 Sixth International Conference on Hybrid Intelligent Systems (HIS’06), Rio de Janeiro, Brazil, 13–15 December 2006; p. 44. [Google Scholar]
- Remeikis, N.; Skucas, I.; Melninkaitè, V. Hybrid machine learning approach for text categorization. Int. J. Comput. Intell. 2005, 1, 63–67. [Google Scholar]
- Aldayel, H.K.; Azmi, A.M. Arabic tweets sentiment analysis–a hybrid scheme. J. Inf. Sci. 2016, 42, 782–797. [Google Scholar] [CrossRef]
- Thabtah, F.; Gharaibeh, O.; Al-Zubaidy, R. Arabic text mining using rule based classification. J. Inf. Knowl. Manag. 2012, 11, 1250006. [Google Scholar] [CrossRef] [Green Version]
- Elshakankery, K.; Ahmed, M.F. HILATSA: A hybrid incremental learning approach for Arabic tweets sentiment analysis. Egypt. Inform. J. 2019, 20, 163–171. [Google Scholar] [CrossRef]
- Shaalan, K.; Oudah, M. A hybrid approach to Arabic named entity recognition. J. Inf. Sci. 2014, 40, 67–87. [Google Scholar] [CrossRef] [Green Version]
- Hadni, M.; Ouatik, S.A.; Lachkar, A. Effective Arabic stemmer based hybrid approach for Arabic text categorization. Int. J. Data Min. Knowl. Manag. Process. 2013, 3, 1. [Google Scholar] [CrossRef]
- Al-Saqqa, S.; Obeid, N.; Awajan, A. Sentiment analysis for Arabic text using ensemble learning. In Proceedings of the 2018 IEEE/ACS 15th International Conference on Computer Systems and Applications (AICCSA), Aqaba, Jordan, 28 October–1 November 2018; pp. 1–7. [Google Scholar]
- Altaher, A. Hybrid approach for sentiment analysis of Arabic tweets based on deep learning model and features weighting. Int. J. Adv. Appl. Sci. 2017, 4, 43–49. [Google Scholar] [CrossRef]
- Biltawi, M.; Al-Naymat, G.; Tedmori, S. Arabic sentiment classification: A hybrid approach. In Proceedings of the 2017 International Conference On New Trends In Computing Sciences (ICTCS), Amman, Jordan, 11–13 October 2017; pp. 104–108. [Google Scholar]
- Alhumoud, S.; Albuhairi, T.; Altuwaijri, M. Arabic sentiment analysis using WEKA a hybrid learning approach. In Proceedings of the 2015 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K), Lisbon, Portugal, 12–14 November 2015; Volume 1, pp. 402–408. [Google Scholar]
- Salloum, S.A.; Mhamdi, C.; Al-Emran, M.; Shaalan, K. Analysis and classification of Arabic newspapers’ Facebook pages using text mining techniques. Int. J. Inf. Technol. Lang. Stud. 2017, 1, 8–17. [Google Scholar]
- El-Makky, N.; Nagi, K.; El-Ebshihy, A.; Apady, E.; Hafez, O.; Mostafa, S.; Ibrahim, S. Sentiment analysis of colloquial Arabic tweets. In Proceedings of the ASE BigData/SocialInformatics/PASSAT/BioMedCom 2014 Conference, Harvard University, Cambridge, MA, USA, 14–16 December 2014; pp. 1–9. [Google Scholar]
- Khalifa, K.; Omar, N. A hybrid method using lexicon-based approach and Naive Bayes classifier for Arabic opinion question answering. J. Comput. Sci. 2014, 10, 1961–1968. [Google Scholar] [CrossRef] [Green Version]
- Elzayady, H.; Badran, K.M.; Salama, G.I. Arabic Opinion Mining Using Combined CNN-LSTM Models. Int. J. Intell. Syst. Appl. 2020, 4, 25–36. [Google Scholar] [CrossRef]
- Al-Smadi, M.; Al-Zboon, S.; Jararweh, Y.; Juola, P. Transfer Learning for Arabic Named Entity Recognition With Deep Neural Networks. IEEE Access 2020, 8, 37736–37745. [Google Scholar] [CrossRef]
- Nahar, K.M.; Jaradat, A.; Atoum, M.S.; Ibrahim, F. Sentiment analysis and classification of arab jordanian facebook comments for jordanian telecom companies using lexicon-based approach and machine learning. Jordanian J. Comput. Inf. Technol. (JJCIT) 2020, 6. [Google Scholar] [CrossRef]
- Binsaeed, K.; Stringhini, G.; Youssef, A.E. Detecting Spam in Twitter Microblogging Services: A Novel Machine Learning Approach based on Domain Popularity. Int. J. Adv. Comput. Sci. Appl. (IJACSA) 2020. [Google Scholar] [CrossRef]
- Taboada, M.; Brooke, J.; Tofiloski, M.; Voll, K.; Stede, M. Lexicon-based methods for sentiment analysis. Comput. Linguist. 2011, 37, 267–307. [Google Scholar] [CrossRef]
- Abdulla, N.A.; Ahmed, N.A.; Shehab, M.A.; Al-Ayyoub, M. Arabic sentiment analysis: Lexicon-based and corpus-based. In Proceedings of the 2013 IEEE Jordan Conference on Applied Electrical Engineering and Computing Technologies (AEECT), Amman, Jordan, 3–5 December 2013; pp. 1–6. [Google Scholar]
- Al-Ayyoub, M.; Essa, S.B.; Alsmadi, I. Lexicon-based sentiment analysis of arabic tweets. Int. J. Soc. Netw. Min. 2015, 2, 101–114. [Google Scholar] [CrossRef]
- Mataoui, M.; Zelmati, O.; Boumechache, M. A proposed lexicon-based sentiment analysis approach for the vernacular Algerian Arabic. Res. Comput. Sci. 2016, 110, 55–70. [Google Scholar] [CrossRef]
- Duwairi, R.M.; Ahmed, N.A.; Al-Rifai, S.Y. Detecting sentiment embedded in Arabic social media–a lexicon-based approach. J. Intell. Fuzzy Syst. 2015, 29, 107–117. [Google Scholar] [CrossRef]
- Badaro, G.; Baly, R.; Akel, R.; Fayad, L.; Khairallah, J.; Hajj, H.; Shaban, K.; El-Hajj, W. A light lexicon-based mobile application for sentiment mining of arabic tweets. In Proceedings of the Second Workshop on Arabic Natural Language Processing, Beijing, China, 30 July 2015; pp. 18–25. [Google Scholar]
- Hmeidi, I.; Al-Ayyoub, M.; Mahyoub, N.A.; Shehab, M.A. A lexicon based approach for classifying Arabic multi-labeled text. Int. J. Web Inf. Syst. 2016. [Google Scholar] [CrossRef]
- Abdulla, N.; Majdalawi, R.; Mohammed, S.; Al-Ayyoub, M.; Al-Kabi, M. Automatic Lexicon Construction for Arabic Sentiment Analysis. In Proceedings of the 2014 International Conference on Future Internet of Things and Cloud, Barcelona, Spain, 27–29 August 2014; pp. 547–552. [Google Scholar] [CrossRef]
- Al-Smadi, M.; Talafha, B.; Al-Ayyoub, M.; Jararweh, Y. Using long short-term memory deep neural networks for aspect-based sentiment analysis of Arabic reviews. Int. J. Mach. Learn. Cybern. 2019, 10, 2163–2175. [Google Scholar] [CrossRef]
- Alayba, A.M.; Palade, V.; England, M.; Iqbal, R. Improving sentiment analysis in Arabic using word representation. In Proceedings of the 2018 IEEE 2nd International Workshop on Arabic and Derived Script Analysis and Recognition (ASAR), London, UK, 12–14 March 2018; pp. 13–18. [Google Scholar]
- Abdulla, N.A.; Ahmed, N.A.; Shehab, M.A.; Al-Ayyoub, M.; Al-Kabi, M.N.; Al-rifai, S. Towards improving the lexicon-based approach for arabic sentiment analysis. Int. J. Inf. Technol. Web Eng. (IJITWE) 2014, 9, 55–71. [Google Scholar] [CrossRef] [Green Version]
- Ibrahim, H.S.; Abdou, S.M.; Gheith, M. Sentiment analysis for modern standard arabic and colloquial. arXiv 2015, arXiv:1505.03105. [Google Scholar] [CrossRef]
- Mohammad, S.M.; Salameh, M.; Kiritchenko, S. How translation alters sentiment. J. Artif. Intell. Res. 2016, 55, 95–130. [Google Scholar] [CrossRef]
- Aloqaily, A.; Alhassan, M.; Salah, K.; Elshqeirat, B.; Almashagbah, M. Sentiment analysis for arabic tweets datasets: Lexicon-based and machine learning approaches. J. Theor. Appl. Inf. Technol. 2020, 98, 114–122. [Google Scholar]
- Alhammi, H.A.; Haddar, K. Building a Libyan Dialect Lexicon-Based Sentiment Analysis System Using Semantic Orientation of Adjective-Adverb Combinations. Int. J. Comput. Theory Eng. 2020, 12. [Google Scholar] [CrossRef]
- Touahri, I.; Mazroui, A. Deep analysis of an Arabic sentiment classification system based on lexical resource expansion and custom approaches building. Int. J. Speech Technol. 2020, 24, 109–126. [Google Scholar] [CrossRef]
- Baly, R.; Badaro, G.; El-Khoury, G.; Moukalled, R.; Aoun, R.; Hajj, H.; El-Hajj, W.; Habash, N.; Shaban, K. A characterization study of arabic twitter data with a benchmarking for state-of-the-art opinion mining models. In Proceedings of the Third Arabic Natural Language Processing Workshop, Valencia, Spain, 3 April 2017; pp. 110–118. [Google Scholar]
- Cliche, M. BB_twtr at SemEval-2017 task 4: Twitter sentiment analysis with CNNs and LSTMs. arXiv 2017, arXiv:1704.06125. [Google Scholar]
- Al Sallab, A.; Hajj, H.; Badaro, G.; Baly, R.; El-Hajj, W.; Shaban, K. Deep learning models for sentiment analysis in Arabic. In Proceedings of the Second Workshop on Arabic Natural Language Processing, Beijing, China, 30 July 2015; pp. 9–17. [Google Scholar]
- Mohammed, A.; Kora, R. Deep learning approaches for Arabic sentiment analysis. Soc. Netw. Anal. Min. 2019, 9, 1–12. [Google Scholar] [CrossRef]
- Omara, E.; Mosa, M.; Ismail, N. Deep convolutional network for arabic sentiment analysis. In Proceedings of the 2018 International Japan-Africa Conference on Electronics, Communications and Computations (JAC-ECC), Alexandria, Egypt, 16–18 December 2018; pp. 155–159. [Google Scholar]
- Chowdhury, S.A.; Abdelali, A.; Darwish, K.; Soon-Gyo, J.; Salminen, J.; Jansen, B.J. Improving Arabic text categorization using transformer training diversification. In Proceedings of the Fifth Arabic Natural Language Processing Workshop, Barcelona, Spain, 12 December 2020; pp. 226–236. [Google Scholar]
- Farha, I.A.; Magdy, W. Benchmarking Transformer-based Language Models for Arabic Sentiment and Sarcasm Detection. In Proceedings of the Sixth Arabic Natural Language Processing Workshop, Kyiv, Ukraine, 19 April 2021; pp. 21–31. [Google Scholar]
- Abuzayed, A.; Al-Khalifa, H. Sarcasm and Sentiment Detection In Arabic Tweets Using BERT-based Models and Data Augmentation. In Proceedings of the Sixth Arabic Natural Language Processing Workshop, Kyiv, Ukraine, 19 April 2021; pp. 312–317. [Google Scholar]
- Abdul-Mageed, M.; Elmadany, A.; Nagoudi, E.M.B. ARBERT & MARBERT: Deep Bidirectional Transformers for Arabic. arXiv 2020, arXiv:2101.01785. [Google Scholar]
Ref. | Year | Objectives | Techniques | Machine Learning | Arabic Tweets |
---|---|---|---|---|---|
Alhumoud et al. [20] | 2015 | Tools and techniques for Arabic data analysis | Discuss different Machine learning approaches used for SA | ✔✔ | NA |
Al-Ayyoub et al. [22] | 2019 | Comprehensive analysis of general Arabic sentiment analysis | Aspect-Based SA, Binary SA, Ternary SA, Multi-Way SA, Multilingual SA | ✔✔ | Partially |
Guellil et al. [23] | 2019 | Surveyed different Arabic varieties, i.e., Classical Arabic, Modern Standard Arabic, and Arabic dialects | Text mining and machine learning techniques | ✔✔ | Partially |
Badaro et al. [24] | 2019 | Different tools, resources, and techniques, for Arabic text analysis | Machine learning and others | ✔✔ | Partially |
El-Masri et al. [27] | 2017 | Presenting Deep learning techniques used in different applications of Arabic SA | Deep learning | Partially | Partially |
Ghallab et al. [30] | 2020 | Journal and conference based classification | ML, DL, hybrid | ✔✔ | Partially |
Abo et al. [31] | 2020 | Survey on Arabic text processing | ML, hybrid, lexicon | ✔✔ | Partially |
This survey | 2021 | Classifications of Arabic Tweets | Machine Learning, hybrid | ✔✔ | ✔✔ |
Author | Technique Used | Data Set | Accuracy | Strengths | Weaknesses |
---|---|---|---|---|---|
Duwari et al. [71] | NB, KNN, SVM | | 75.25% | 10-fold cross-validation was used with multiple classifiers | The data set was very small, and accuracy was low |
Atoum et al. [72] | SVM, NB | | 82.1% | Balanced and unbalanced data sets were used for all models | Accuracy was not good without light stemming |
Al-Horaibi et al. [74] | NB, Decision Tree | | 64.84% | Data set was annotated by two native Arabic annotators | The size of the tweets was small |
Abdelaal et al. [75] | Ensemble of surface features | | 88.6% | Bagging, boosting, and stacking were used to improve accuracy | Training takes more time, and a single data set was used for all algorithms |
Jardaneh et al. [73] | AdaBoost, RF, DT | | 76% | Both content-based and user-based features were used | Accuracy was low |
Alasand et al. [76] | DMNB | | 88.67% | 10-fold cross-validation was used | Data set was small |
Duwairi et al. [77] | NB, SVM, and KNN | | 69.97% | Weighting schemes such as TF, TF-IDF, and BoW were used | Accuracy of the system was not good |
Ismail et al. [78] | NB, SVM, and KNN | Sudanese Arabic dialect corpus | 92% | Tweets were manually labeled by 3 Arabic speakers | Manual labeling requires a huge amount of time |
Alsaleem [79] | NB, SVM | SNP Arabic | 77.9% | Different Arabic data sets were used for training | Classification accuracy was not too good |
Salamah et al. [80] | Decision Tree and SVM | Kuwaiti language | 76% | Manual annotation of the data set | The data set was too big, which increased training time |
Al-Osaimi et al. [81] | NB and Decision Tree | | 63.79% | Detects emotion icons in the tweets | The accuracy of the system was low |
Abdul-Mageed et al. [82] | SVM | 3015 tweets | 69% | Testing with TAGREED corpus | Accuracy was not good |
Amira et al. [83] | SVM and NB | | NA | Unigrams and combinations of unigrams and bigrams were used | Neutral category was not considered; data set was very small |
Harrag et al. [86] | Decision Tree | Hadith | 93% | Two different data sets were used | Low accuracy on Hadith corpus |
Helmy et al. [91] | SVM, BPM | Hadith | 96% | Models were used without machine translation | Accuracy of BPM was low |
Motaz et al. [87] | Decision Tree | Al-jazeera news | 94% | A combination of preprocessing techniques was used | Data set was small |
Rasheed et al. [88] | Decision Tree, SVM, and NB | Arabic YouTube pages | 94.5% | Similarity and sentiment word features were used | The model requires more computation |
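Several rows above mention TF and TF-IDF term weighting as the feature step before a classifier such as NB or SVM. As a rough illustration only, the weighting can be sketched in plain Python as below. The function name `tf_idf`, the toy tweets, the whitespace tokenization, and the unsmoothed `log(N / df)` variant are all assumptions made for this sketch and are not taken from any of the surveyed papers.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weighted = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weighted.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weighted

# Toy, whitespace-tokenized tweets (illustrative only, not from any corpus).
docs = [
    "الخدمة ممتازة جدا".split(),
    "الخدمة سيئة جدا".split(),
    "الفيلم ممتاز".split(),
]
weights = tf_idf(docs)
```

In this toy input, a word that appears in only one tweet receives a higher weight than a word shared by two tweets. Note also that "ممتازة" and "ممتاز" are counted as different terms, which hints at why stemming matters in the studies listed above.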
Author | Technique Used | Data Set | Accuracy | Strengths | Weaknesses |
---|---|---|---|---|---|
Salloum et al. [103] | Unsupervised approach | Custom data set | 92% | A DA-English parallel corpus was used | Only morphological segmentation of text is considered |
Alzanin et al. [94] | Gaussian Naïve Bayes | | 78.6% | Rumor detection in the Arabic language | The data set was small |
Abuaidah et al. [95] | K-means | | 98% | Five similarity functions were used | Data set was small |
Mostafa et al. [96] | Clustering | | 77% | A predefined expert lexicon of 6800 seed adjectives was used | Sarcastic expressions were not detected by the system |
Sangaiah et al. [97] | K-means with dimensionality reduction | Custom data set | 82% | Term weighting method was used | Increasing the reduction ratio damages essential factors |
Alotaibi et al. [104] | K-Nearest Neighbors | News Reviews | 82% | Word clustering was used | Word clustering increases computation |
Abuaiadah et al. [98] | Standard K-means, Bisect K-means algorithms | | 76.4% | Root-based stemming was used, which requires less memory | Data set was small and accuracy was not too good |
Elarnaoty et al. [99] | CRF | Arabic news text | 85.52 | Sequential tagging was used | Performance on the Arabic opinion holder extraction task could be improved by a robust Arabic lexical parser |
Oraby et al. [100] | Rule-based | Arabic movie reviews | N/A | Basic decomposition and modeling of the Arabic grammatical structure. | Small data set |
Author | Technique | Data Set | Accuracy | Strengths | Weaknesses |
---|---|---|---|---|---|
Aldayal et al. [108] | Semantic orientation + ML | | 84.01% | Basic features of the Arabic language were discussed | Data set was small |
Thabtah et al. [109] | RIPPER + PART | | 83% | The study blends semantic orientation and ML | The training time of the system was high |
Elshakankery et al. [110] | Lexicon + ML technique | Multiple data sets | 84.6% | Combines both lexicon-based and machine learning models | The system requires a huge amount of computation |
Shaalan et al. [111] | Rule based + ML | | 90% | Addresses the bottleneck of language | Data set was small |
Hadni et al. [112] | NB + SVM | Kalimat Corpus | 94.4% | Three well-known Stemmers were used Tagging | Accuracy on unknown words was low |
AL-SAQQA et al. [113] | Four combinations of SVM, NB, and DT | Reviews, tweets | 91% | Balanced data set was used | Neutral category was not considered |
Altaher et al. [114] | Deep learning techniques | | 90% | Deep learning with weighting characteristics was used | The semantics of Arabic tweets were not considered |
Biltawi et al. [115] | Dictionary + corpus-based techniques | Movie Reviews | 96.34% | Two separate data sets were used for testing | Lexicon generation takes a huge amount of time |
Alhumoud et al. [116] | Custom | | 90% | Data from three domains, i.e., sports, social, and political, were used | Low accuracy on sports data |
Salloum et al. [117] | Custom | Facebook, Newspapers | 80% | Provides good analysis of Newspapers and Facebook data using ML algorithms. | Data set was very small. |
Elzayady et al. [120] | Deep learning models | Education and Politics | 80% | Various models were built along with hybrid model | Bidirectional LSTMs can produce good results. |
El-Makky et al. [118] | Custom | | 84% | The Arabic lexicon was formed by combining two Modern and two Egyptian Arabic lexicons | Small data set was used in this study |
Khalifa et al. [119] | Lexicon + NB | Reviews on Jordan hotels | 91% | Various supervised models were built with lexicon model | Data set was not manually annotated and no standard data set was used |
Author | Technique | Data Set | Accuracy | Strengths | Weaknesses |
---|---|---|---|---|---|
Alsmadi et al. [132] | Dictionary based | | 86.89% | Predicate calculus was used for text classification | The data set was so big that it took a long time |
Abdulla et al. [125] | Dictionary and corpus based | Custom | 84.4% | Both techniques were used | Data set was small |
Hamed et al. [127] | Dictionary based | Algerian Arabic Corpus | 78.13% | Three different lexicons were used | Accuracy was low |
Duwairi et al. [128] | Corpus based | | 70% | Senti-Strength English sentiment was used for translating tweets | Low accuracy level |
Abdullah et al. [125] | Clustering | Manual data set | 87% | Corpus and lexicon | No standard data set was used |
Badaro et al. [129] | Corpus based | | 67.3% | Mobile app was built | Low accuracy level |
Alayba et al. [133] | Word embeddings | Al-Khair Corpus | 92% | Various word2vec models were built | High computation because of very large data set |
Abdulla et al. [131] | Dictionary based | | 74.6% | Various novel features, i.e., negation and intensification, were used | Accuracy was low |
Kabi et al. [134] | Dictionary based | Twitter, Yahoo-Maktoob | 70.05% | Data were analysed with both light stemming and no stemming | Accuracy on the Yahoo corpus was very low, i.e., 63% |
Hossam et al. [135] | Corpus based | Arabic tweets, product reviews, hotel reviews | 95% | Various data sets were used | System was not able to detect sarcasm and some idioms |
Mohammad et al. [136] | Corpus based | | 66.57% | Arabic text was translated and then classified | Low accuracy score |
Al-Ayyoub et al. [126] | Dictionary based | | 86.89% | Large lexicon of words was used | The approach did not handle domain-specific issues |
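The dictionary-based rows above mention negation and intensification as features. A minimal sketch of how such a scorer might work is given below, assuming a tiny hand-made lexicon. The word lists `LEXICON`, `NEGATORS`, and `INTENSIFIERS`, their weights, and the rule that a negator flips the next sentiment word are hypothetical choices made for illustration and do not reproduce any surveyed system.

```python
# Hypothetical toy resources: real systems use lexicons with thousands of entries.
LEXICON = {"جميل": 1, "ممتاز": 1, "سيء": -1, "ممل": -1}  # word -> polarity
NEGATORS = {"لا", "ليس", "مش"}
INTENSIFIERS = {"جدا": 2.0, "كثيرا": 1.5}  # word -> multiplier

def score(tokens):
    """Sum word polarities, applying simple negation and intensification."""
    total = 0.0
    negate = False
    for i, tok in enumerate(tokens):
        if tok in NEGATORS:
            # Assumed rule: a negator flips the next sentiment word.
            negate = True
            continue
        if tok in LEXICON:
            value = float(LEXICON[tok])
            # Assumed rule: an intensifier directly after the word scales it.
            if i + 1 < len(tokens) and tokens[i + 1] in INTENSIFIERS:
                value *= INTENSIFIERS[tokens[i + 1]]
            if negate:
                value = -value
                negate = False
            total += value
    return total

def classify(text):
    """Map the summed score to a three-way label."""
    s = score(text.split())
    if s > 0:
        return "positive"
    if s < 0:
        return "negative"
    return "neutral"
```

For example, `classify("الفيلم ليس جميل")` yields `"negative"` because the negator flips the positive word, while a sentence with no lexicon words is labeled `"neutral"`. Sarcasm and idioms, noted as weaknesses in the table, are outside what a rule set like this can capture.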
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Alruily, M. Classification of Arabic Tweets: A Review. Electronics 2021, 10, 1143. https://doi.org/10.3390/electronics10101143