An Empirical Study on Customer Churn Behaviours Prediction Using Arabic Twitter Mining Approach
Abstract
1. Introduction
- Current churn prediction models have a relatively short useful life because they rely on customers’ historical data, which becomes less valuable for prediction over time [6]; as a result, they may not give telecom companies the best churn predictions.
- There is a lack of research that integrates a structural data framework with real-time analytics to target customers in real time [7].
- Current churn prediction models exclude location and language factors, which causes geographical and cultural sampling errors [8].
- It is the first work using Twitter mining to predict potential customer loss (churn) in Saudi telecom companies.
- It identifies and evaluates the main gaps in the current churn prediction models.
- It proposes and evaluates a novel design of a churn prediction model to address the gaps in current churn prediction models by providing a real-time method that suits the telecom data and the Arabic data set.
- It contributes to the Arabic sentiment analysis (ASA) research community by applying state-of-the-art techniques in new experiments on a relatively new, unexplored and extensive Arabic dialect dataset.
2. Related Research
2.1. Predicting Customer Churn and Data Mining Techniques
2.2. New Customer Churn Model Variables
Ref | Dataset | Algorithm | Results/Future Work |
---|---|---|---|
[41] | Customer data from SyriaTel telecom company. | Decision tree, random forest, gradient boosted machine tree and extreme gradient boosting (XGBOOST). | The best results were obtained with the XGBOOST algorithm, which reached an area under the curve (AUC) value of 93.3%. |
[7] | The unstructured data included: (1) details of customer complaints and feedback; (2) data records captured, such as data regarding purchases, downloads of apps, etc. | RFM technique | They recommended the integration of the structure data framework with real-time analytics to target customers in real time on the basis of location, time, etc. |
[42] | Available historical records extracted from the telecom industry. | Logistic regression and decision trees in R. | The data mining techniques could be a promising solution for customer churn management. |
[43] | Two telecom industry datasets were considered. Type-1 contained 3333 records, and Type-2 contained 20,468 records. | Axiomatic fuzzy set theory and parallel density-based spatial clustering on the Hadoop MapReduce framework. | The proposed model is more efficient than the existing system in terms of time and performance. |
[2] | Online available customers dataset at Kaggle https://www.kaggle.com/ (accessed on 23 June 2021). | Different classifiers implemented in WEKA. | Concluded that bagging and the SMO algorithm performed best, with an accuracy of 99.8% using 14 attributes. |
[3] | A total of 153,651 distinct tweets for the Twitter handles of five popular telecom brands in India. | Semantic analysis. | Proved that sentiment analysis can manage the higher growth rate of new subscribers who were added to the brand in the study period. |
[28] | Tweets related to Telkom’s broadband internet service and customer churn rate data history from the company’s data warehouse. | Applied sentiment analysis using recurrent neural network LSTM. | Results indicated that the accuracy of the churn rate predictions (based on the previous three months) is correlated with negative moods. |
[8] | Used the Pokec social network data and generated synthetic call log details of 25,000 users | Used influence maximisation | Future analysis should factor in both location and language to avoid geographical and cultural sampling errors. |
Num. | Customer Churn Variables | Description | Type of Variable | Range |
---|---|---|---|---|
1 | Age | Age group has been identified | Ordinal variable | 18–24, ‘1’; 25–34, ‘2’; 35–44, ‘3’; 45–54, ‘4’; 55–64, ‘5’; 65+, ‘6’ |
2 | Gender | Male or Female | Binary variable | Male, ‘0’; Female, ‘1’ |
3 | Has a relation at the same telecom company | Does the customer have a family member who used the same telecom provider as he/she did? | Binary variable | Yes, ‘1’; No, ‘0’ |
4 | Overdue bill | Does the customer have an unpaid bill? | Binary variable | Yes, ‘1’; No, ‘0’ |
5 | Long period | Contract length in month from start day of contract until June 2017 | Ordinal variable | ≥1, ‘1’; 1 ≥ 5, ‘2’; 5 ≥ 10, ‘3’; 10+, ‘4’ |
6 | New customer | Has the customer used a telecom provider recently? | Binary variable | Yes, ‘1’; No, ‘0’ |
7 | Inactive | Is the customer active? | Binary variable | Yes, ‘1’; No, ‘0’ |
8 | Low data | Does the customer have low data usage? | Binary variable | Yes, ‘1’; No, ‘0’ |
9 | Low talk | Does the customer make few phone calls? | Binary variable | Yes, ‘1’; No, ‘0’ |
10 | No Internet and talk and SMS | Does the customer not use the Internet, phone calls and short message service? | Binary variable | Yes, ‘1’; No, ‘0’ |
11 | No value-added service | Does the customer use any of the non-core services? | Binary variable | Yes, ‘1’; No, ‘0’ |
12 | Customer satisfaction | Percentage of customer satisfaction from Twitter analysis [44] | Continuous variable | |
13 | Churn status | Does the customer churn? | Binary variable | Churner/Non-churner: Churner, ‘1’; Non-churner, ‘0’ |
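The codings in the table above can be illustrated with a minimal Python sketch. It covers only the age groups and the binary variables, whose ranges are stated unambiguously in the table; the function names (`encode_age`, `encode_gender`, `encode_yes_no`) are hypothetical and are not taken from the study.

```python
from bisect import bisect_right

# Lower bounds of the age groups in the table: 18–24, 25–34, 35–44, 45–54, 55–64, 65+.
AGE_LOWER_BOUNDS = [18, 25, 35, 45, 55, 65]


def encode_age(age: int) -> int:
    """Map an age in years to the ordinal code 1..6 used in the table."""
    if age < 18:
        raise ValueError("ages below 18 are outside the ranges listed in the table")
    # bisect_right counts how many lower bounds are <= age, which is the group code.
    return bisect_right(AGE_LOWER_BOUNDS, age)


def encode_gender(gender: str) -> int:
    """Map gender to the table's binary code: Male -> 0, Female -> 1."""
    codes = {"male": 0, "female": 1}
    return codes[gender.strip().lower()]


def encode_yes_no(answer: bool) -> int:
    """Map a yes/no variable (e.g. overdue bill) to Yes -> 1, No -> 0."""
    return 1 if answer else 0


if __name__ == "__main__":
    # Illustrative record only; the values are invented for this example.
    print(encode_age(29), encode_gender("Female"), encode_yes_no(True))
```

The contract-length variable is left out on purpose, since its range boundaries would need to be confirmed before they are encoded.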
3. Methodology
3.1. Data Set Construction
3.2. Customer Satisfaction Rate
3.3. Dataset Cleaning, Pre-Processing and Annotation
3.4. Using the Model to Measure the Customer Satisfaction Rate
3.5. Historical Data Set Preparation
3.6. Modelling
- FP: indicates that our model predicts the customer is a churner but the customer is a non-churner.
- FN: indicates that our model predicts the customer is a non-churner but the customer is a churner.
- TP: indicates our model correctly predicts the customer is a churner.
- TN: indicates our model correctly predicts the customer is a non-churner.
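As a sketch of how these four counts feed the usual evaluation metrics, the following Python function computes precision, recall, F1-score and accuracy for the churner class using the standard definitions. The function name `churn_metrics` and the counts in the usage example are hypothetical; they are not results from the study.

```python
def churn_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute standard metrics for the churner class from confusion-matrix counts."""
    # Precision: share of predicted churners that really churned.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of real churners that the model caught.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1-score: harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    # Accuracy: share of all customers classified correctly.
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


if __name__ == "__main__":
    # Made-up counts for illustration only.
    print(churn_metrics(tp=8, fp=2, fn=0, tn=10))
```

Swapping the roles of the two classes (treating non-churner as the positive class) gives the per-class figures for non-churners in the same way.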
4. Results and Discussion
4.1. Training the SentiChurn Model
4.2. Evaluating the Model
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Li, P.; Bi, T.; Liu, Y.; Li, S. Telecom Customer Churn Prediction Method Based on Cluster Stratified Sampling Logistic Regression. In Proceedings of the International Conference on Software Intelligence Technologies and Applications & International Conference on Frontiers of Internet of Things 2014, Institution of Engineering and Technology (IET), Hsinchu, Taiwan, 4–6 December 2014; pp. 282–287.
- Ali, M.; Rehman, A.U.; Hafeez, S.; Ashraf, M.U. Prediction of Churning Behavior of Customers in Telecom Sector Using Supervised Learning Techniques. In Proceedings of the 2018 International Conference on Computer, Control, Electrical, and Electronics Engineering (ICCCEEE), IEEE, Khartoum, Sudan, 12–14 August 2018.
- Ranjan, S.; Sood, S.; Verma, V. Twitter Sentiment Analysis of Real-Time Customer Experience Feedback for Predicting Growth of Indian Telecom Companies. In Proceedings of the 2018 4th International Conference on Computing Sciences (ICCS), Institute of Electrical and Electronics Engineers (IEEE), Jalandhar, India, 30–31 August 2018; pp. 166–174.
- Bhatnagar, V. Data Mining and Analysis in the Engineering Field; IGI Global: Hershey, PA, USA, 2014.
- Santharam, A.; Krishnan, S.B. Survey on Customer Churn Prediction Techniques. Int. Res. J. Eng. Technol. 2018, 5, 3.
- Hassouna, M.; Tarhini, A.; Elyas, T.; Trab, M.S.A. Customer Churn in Mobile Markets: A Comparison of Techniques. Int. Bus. Res. 2015, 8, p224.
- Singh, I.; Singh, S. Framework for Targeting High Value Customers and Potential Churn Customers in Telecom using Big Data Analytics. Int. J. Educ. Manag. Eng. 2017, 7, 36–45.
- Pagare, R.; Khare, A. Churn prediction by finding most influential nodes in social network. In Proceedings of the 2016 International Conference on Computing, Analytics and Security Trends (CAST), Institute of Electrical and Electronics Engineers (IEEE), Pune, India, 19–21 December 2016; pp. 68–71.
- Mostafa, M.M. More than words: Social networks’ text mining for consumer brand sentiments. Expert Syst. Appl. 2013, 40, 4241–4251.
- Marcus, A.; Bernstein, M.S.; Badar, O.; Karger, D.R.; Madden, S.; Miller, R.C. Processing and visualizing the data in tweets. ACM SIGMOD Rec. 2012, 40, 21–27.
- Castronovo, C.; Huang, L. Social media in an alternative marketing communication model. J. Mark. Develop. Compet. 2012, 6, 117–134.
- Mahyoub, F.; Siddiqui, M.A.; Dahab, M.Y. Building an Arabic Sentiment Lexicon Using Semi-supervised Learning. J. King Saud. Univ. Comput. Inf. Sci. 2014, 26, 417–424.
- Al-Saggaf, Y.; Simmons, P. Social media in Saudi Arabia: Exploring its use during two natural disasters. Technol. Forecast. Soc. Chang. 2015, 95, 3–15.
- Mourtada, R.; Salem, F. Citizen Engagement and Public Services in the Arab World: The Potential of Social Media; Arab Social Media Report Series; Elsevier: Amsterdam, The Netherlands, 2014.
- Duwairi, R.M.; Qarqaz, I. Arabic sentiment analysis using supervised classification. In Proceedings of the 2014 International Conference on Future Internet of Things and Cloud, IEEE, Barcelona, Spain, 27–29 August 2014.
- Al-Twairesh, N. Sentiment Analysis of Twitter: A Study on the Saudi Community. Ph.D. Thesis, King Saud University, Riyadh, Saudi Arabia, 2016.
- Syiam, M.M.; Fayed, Z.T.; Habib, M. An intelligent system for Arabic text categorization. Int. J. Intell. Comput. Inf. Sci. 2006, 6, 1–19.
- Masmoudi, A.; Khmekhem, M.E.; Esteve, Y.; Belguith, L.H.; Habash, N. A Corpus and Phonetic Dictionary for Tunisian Arabic Speech Recognition. In Proceedings of the LREC, Reykjavik, Iceland, 26–31 May 2014.
- Farghaly, A.; Shaalan, K. Arabic natural language processing: Challenges and solutions. ACM Trans. Asian Lang. Inf. Process. 2009, 8, 14.
- Al-Twairesh, N.; Al-Khalifa, H.; Al-Salman, A.; Al-Ohali, Y. Arasenti-tweet: A corpus for arabic sentiment analysis of saudi tweets. Proced. Comput. Sci. 2017, 117, 63–72.
- Al-Twairesh, N.; Al-Khalifa, H.; Alsalman, A.; Al-Ohali, Y. Sentiment analysis of arabic tweets: Feature engineering and a hybrid approach. arXiv 2018, arXiv:1805.08533.
- Asaari, M.; Karia, N. Business Strategy: Customer Satisfaction among Cellular Providers in Malaysia. In Proceedings of the European Applied Business Research Conference Proc., Venice, Italy, 9 June 2003.
- Yoo, D.-K.; Suh, S.-W. The effect of medical service quality and perceived risk on customer satisfaction, repurchase intention, and churn intention as to hospital sizes. Korea Serv. Manag. Soc. 2009, 10, 97–130.
- Hassan, R.S.; Nawaz, A.; Lashari, M.N.; Zafar, F. Effect of Customer Relationship Management on Customer Satisfaction. Procedia Econ. Financ. 2015, 23, 563–567.
- Chen, Y.-B.; Li, B.-S.; Ge, X.-Q. Study on Predictive Model of Customer Churn of Mobile Telecommunication Company. In Proceedings of the 2011 Fourth International Conference on Business Intelligence and Financial Engineering, Institute of Electrical and Electronics Engineers (IEEE), Wuhan, China, 17–18 October 2011; pp. 114–117.
- Kentrias, S. Customer relationship management: The SAS perspective. Retriev. Mar. 2001, 24, 2011.
- Mahajan, V.; Mahajan, R. Variable Selection of Customers for Churn Analysis in Telecommunication Industry. Int. J. Virtual Communities Soc. Netw. (IJVCSN) 2018, 10, 17–32.
- Napitu, F.; Bijaksana, M.A.; Trisetyarso, A.; Heryadi, Y. Twitter opinion mining predicts broadband internet’s customer churn rate. In Proceedings of the 2017 IEEE International Conference on Cybernetics and Computational Intelligence (CyberneticsCom), IEEE, Phuket, Thailand, 20–22 November 2017.
- Oyeniyi, A.; Adeyemo, A. Customer churn analysis in banking sector using data mining techniques. Afr. J. Comput. ICT 2015, 8, 165–174.
- Coussement, K.; Lessmann, S.; Verstraeten, G. A comparative analysis of data preparation algorithms for customer churn prediction: A case study in the telecommunication industry. Decis. Support Syst. 2017, 95, 27–36.
- Forhad, N.; Hussain, S.; Rahman, R.M. Churn analysis: Predicting churners. In Proceedings of the Ninth International Conference on Digital Information Management (ICDIM 2014), Institute of Electrical and Electronics Engineers (IEEE), Phitsanulok, Thailand, 29 September–1 October 2014; pp. 237–241.
- Mohanty, R.; Rani, K.J. Application of Computational Intelligence to Predict Churn and Non-Churn of Customers in Indian Telecommunication. In Proceedings of the 2015 International Conference on Computational Intelligence and Communication Networks (CICN), Institute of Electrical and Electronics Engineers (IEEE), Jabalpur, India, 12–14 December 2015; pp. 598–603.
- Hung, S.-Y.; Yen, D.C.; Wang, H.-Y. Applying data mining to telecom churn management. Expert Syst. Appl. 2006, 31, 515–524.
- Olle, G.D.O.; Cai, S. A hybrid churn prediction model in mobile telecommunication industry. Int. J. e-Edu. e-Bus. e-Manag. e-Learn. 2014, 4, 55.
- Shaaban, E.; Helmy, Y.; Khedr, A.; Nasr, M. A proposed churn prediction model. Int. J. Eng. Res. Appl. 2012, 2, 693–697.
- Balasubramanian, D.M.; Selvarani, M. Churn Prediction in Mobile Telecom System Using Data Mining Techniques. Int. J. Sci. Res. Publ. 2014, 4, 1–5.
- Haenlein, M. Social interactions in customer churn decisions: The impact of relationship directionality. Int. J. Res. Mark. 2013, 30, 236–248.
- Oskarsdottir, M.; Bravo, C.; Verbeke, W.; Sarraute, C.; Baesens, B.; Vanthienen, J. A comparative study of social network classifiers for predicting churn in the telecommunication industry. In Proceedings of the 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), Institute of Electrical and Electronics Engineers (IEEE), San Francisco, CA, USA, 18–21 August 2016; pp. 1151–1158.
- Saraswat, S.; Tiwari, A. A New Approach for Customer Churn Prediction in Telecom Industry. Int. J. Comput. Appl. 2018, 181, 40–46.
- Verbeke, W.; Martens, D.; Baesens, B. Social network analysis for customer churn prediction. Appl. Soft Comput. 2014, 14, 431–446.
- Ahmad, A.K.; Jafar, A.; Aljoumaa, K. Customer churn prediction in telecom using machine learning in big data platform. J. Big Data 2019, 6, 28.
- Dalvi, P.K.; Khandge, S.K.; Deomore, A.; Bankar, A.; Kanade, V.A. Analysis of customer churn prediction in telecom industry using decision trees and logistic regression. In Proceedings of the 2016 Symposium on Colossal Data Analysis and Networking (CDAN), Institute of Electrical and Electronics Engineers (IEEE), Indore, India, 18–19 March 2016; pp. 1–4.
- Dulhare, U.N.; Ghori, I. An efficient hybrid clustering to predict the risk of customer churn. In Proceedings of the 2018 2nd International Conference on Inventive Systems and Control (ICISC), Institute of Electrical and Electronics Engineers (IEEE), Coimbatore, India, 19–20 January 2018; pp. 673–677.
- Almuqren, L.A.; Moh’d Qasem, M.; Cristea, A.I. Using Deep Learning Networks to Predict Telecom Company Customer Satisfaction Based on Arabic Tweets; ISD: Toulon, France, 2019.
- Keramati, A.; Jafari-Marandi, R.; Aliannejadi, M.; Ahmadian, I.; Mozaffari, M.; Abbasi, U. Improved churn prediction in telecommunication industry using data mining techniques. Appl. Soft Comput. 2014, 24, 994–1012.
- Tiwari, A.; Sam, R.; Shaikh, S. Analysis and prediction of churn customers for telecommunication industry. In Proceedings of the 2017 International Conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud) (I-SMAC), Institute of Electrical and Electronics Engineers (IEEE), Palladam, India, 10–11 February 2017; pp. 218–222.
- Amin, A.; Anwar, S.; Adnan, A.; Nawaz, M.; Alawfi, K.; Hussain, A.; Huang, K. Customer churn prediction in the telecommunication sector using a rough set approach. Neurocomputing 2017, 237, 242–254.
- Hudaib, A.; Dannoun, R.; Harfoushi, O.; Obiedat, R.; Faris, H. Hybrid Data Mining Models for Predicting Customer Churn. Int. J. Commun. Netw. Syst. Sci. 2015, 8, 91–96.
- Wei, C.-P.; Chiu, I.-T. Turning telecommunications call details to churn prediction: A data mining approach. Expert Syst. Appl. 2002, 23, 103–112.
- Sonia, S.E.; Rajakumar, S.B.; Nalini, C. Churn Prediction using MAPREDUCE. Int. J. Sci. Eng. Technol. 2014, 3, 597–600.
- Burez, J.; Poel, D.V.D. CRM at a pay-TV company: Using analytical models to reduce customer attrition by targeted marketing for subscription services. Expert Syst. Appl. 2007, 32, 277–288.
- Bin, L.; Peiji, S.; Juan, L. Customer churn prediction based on the decision tree in personal handyphone system service. In Proceedings of the 2007 International Conference on Service Systems and Service Management, IEEE, Chengdu, China, 9–11 June 2007.
- Bakır, B.; Batmaz, I.; Güntürkün, F.; İpekçi, İ.; Köksal, G.; Özdemirel, N. Defect cause modeling with decision tree and regression analysis. World Acad. Sci. Eng. Technol. 2006, 24, 1–4.
- Dahiya, K.; Talwar, K. Customer churn prediction in telecommunication industries using data mining techniques-a review. Int. J. Adv. Res. Comput. Sci. Softw. Eng. 2015, 5, 417–433.
- Gürsoy, U.Ş. Customer churn analysis in telecommunication sector. İstanbul. Üniversitesi. İşletme. Fakültesi. Dergisi. 2010, 39, 35–49.
- Chu, B.-H.; Tsai, M.-S.; Ho, C.-S. Toward a hybrid data mining model for customer retention. Knowl. Based Syst. 2007, 20, 703–718.
- Vafeiadis, T.; Diamantaras, K.; Sarigiannidis, G.; Chatzisavvas, K. A comparison of machine learning techniques for customer churn prediction. Simul. Model. Pr. Theory 2015, 55, 1–9.
- Chathuranga, L.; Rathnayaka, R.; Arumawadu, H. New Customer Churn Prediction Model for Mobile Telecommunication Industry. In Proceedings of the 11th International Research Conference 2018, Galle, Sri Lanka, 7 December 2018.
- Qureshi, S.A.; Rehman, A.S.; Qamar, A.M.; Kamal, A.; Rehman, A. Telecommunication subscribers’ churn prediction model using machine learning. In Proceedings of the Eighth International Conference on Digital Information Management (ICDIM 2013), Institute of Electrical and Electronics Engineers (IEEE), Islamabad, Pakistan, 10–12 September 2013; pp. 131–136.
- Lazarov, V.; Capota, M. Churn prediction. Bus. Anal. Course TUM Comput. Sci. 2007, 33, 34.
- Binti Oseman, K.; Haris, N.A.; bin Abu Bakar, F. Data mining in churn analysis model for telecommunication industry. J. Stat. Model. Anal. 2010, 1, 19–27.
- Hadden, J.; Tiwari, A.; Roy, R.; Ruta, D. Churn prediction: Does technology matter. Int. J. Intel. Technol. 2006, 1, 104–110.
- Larivière, B.; Van den Poel, D. Predicting customer retention and profitability by using random forests and regression forests techniques. Expert Syst. Appl. 2005, 29, 472–484.
- Lu, N.; Lin, H.; Lu, J.; Zhang, G. A Customer Churn Prediction Model in Telecom Industry Using Boosting. IEEE Trans. Ind. Inform. 2012, 10, 1659–1665.
- Brandusoiu, I.; Toderean, G. Churn Prediction in the Telecommunications Sector Using Support Vector Machines. Ann. ORADEA Univ. Fascicle Manag. Technol. Eng. 2013, XXII (XII), 1.
- Glady, N.; Baesens, B.; Croux, C. Modeling churn using customer lifetime value. Eur. J. Oper. Res. 2009, 197, 402–411.
- Tsai, C.-F.; Lu, Y.-H. Customer churn prediction by hybrid neural networks. Expert Syst. Appl. 2009, 36, 12547–12553.
- Xia, G.-E.; Jin, W.-D. Model of Customer Churn Prediction on Support Vector Machine. Syst. Eng. Theory Pract. 2008, 28, 71–77.
- Tsai, C.-F.; Chen, M.-Y. Variable selection by association rules for customer churn prediction of multimedia on demand. Expert Syst. Appl. 2010, 37, 2006–2015.
- Xie, Y.; Li, X.; Ngai, E.W.; Ying, W. Customer churn prediction using improved balanced random forests. Expert Syst. Appl. 2009, 36, 5445–5449.
- Hassouna, M.; Arzoky, M. Agent based modelling and simulation: Toward a new model of customer retention in the mobile market. In Proceedings of the 2011 Summer Computer Simulation Conference, Vista, CA, USA, 27 June 2011.
- Stovel, M.; Bontis, N. Voluntary turnover: Knowledge management—friend or foe? J. Intellect. Cap. 2002, 3, 303–322.
- Leech, G. Corpus Annotation Schemes. Lit. Linguistic Comput. 1993, 8, 275–281.
- Saudi Information Technology Commission. Available online: https://ictind.citc.gov.sa/extensions/ICTPublicReports/Ar/indicator_mobtelservices_byyear_ar.html (accessed on 11 August 2019).
- Wu, X.; Zhu, X.; Wu, G.-Q.; Ding, W. Data mining with big data. IEEE Trans. Knowl. Data Eng. 2013, 26, 97–107.
- Brachman, R.J.; Anand, T. The Process of Knowledge Discovery in Databases. Advances in Knowledge Discovery and Data Mining. IEEE Expert 1996, 11, 37–57.
- Frawley, W.J.; Piatetsky-Shapiro, G.; Matheus, C.J. Knowledge discovery in databases: An overview. AI Mag. 1992, 13, 57.
- Chapman, P.; Clinton, J.; Kerber, R.; Khabaza, T.; Reinartz, T.; Shearer, C.; Wirth, R. CRISP-DM 1.0: Step-by-step data mining guide. SPSS Inc. 2000, 9, 13.
- Shafique, U.; Qaiser, H. A comparative study of data mining process models (KDD, CRISP-DM and SEMMA). Int. J. Innov. Sci. Res. 2014, 12, 217–222.
- Rudin, C.; Wagstaff, K.L. Machine Learning for Science and Society; Springer: Berlin/Heidelberg, Germany, 2014.
- Mariscal, G.; Marbán, Ó.; Fernández, C. A survey of data mining and knowledge discovery process models and methodologies. Knowl. Eng. Rev. 2010, 25, 137–166.
- Li, H.; Yang, D.; Yang, L.; Lu, Y.; Lin, X. Supervised Massive Data Analysis for Telecommunication Customer Churn Prediction. In Proceedings of the 2016 IEEE International Conferences on Big Data and Cloud Computing (BDCloud), Social Computing and Networking (SocialCom), Sustainable Computing and Communications (SustainCom) (BDCloud-SocialCom-SustainCom), Institute of Electrical and Electronics Engineers (IEEE), Atlanta, GA, USA, 8–10 October 2016; pp. 163–169.
- Howard, J.; Ruder, S. Universal language model fine-tuning for text classification. arXiv 2018, arXiv:1801.06146.
- Almuqren, L.; Cristea, A. AraCust: A Saudi Telecom Tweets corpus for sentiment analysis. PeerJ Comput. Sci. 2021, 7, e510.
- Refaee, E.; Rieser, V. An arabic twitter corpus for subjectivity and sentiment analysis. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC’14), Reykjavik, Iceland, 26–31 May 2014.
- Thelwall, M.; Buckley, K.; Paltoglou, G. Sentiment strength detection for the social web. J. Am. Soc. Inf. Sci. Technol. 2011, 63, 163–173.
- Mourad, A.; Darwish, K. Subjectivity and sentiment analysis of modern standard Arabic and Arabic microblogs. In Proceedings of the 4th workshop on computational approaches to subjectivity, sentiment and social media analysis, Atlanta, GA, USA, 14 June 2013.
- Refaee, E.; Rieser, V. iLab-Edinburgh at SemEval-2016 Task 7: A hybrid approach for determining sentiment intensity of Arabic Twitter phrases. In Proceedings of the 10th international workshop on semantic evaluation (SEMEVAL-2016), San Diego, CA, USA, 16–17 June 2016.
- Abdul-Mageed, M.; Diab, M.; Diab, S. SAMAR: Subjectivity and sentiment analysis for Arabic social media. Comput. Speech Lang. 2014, 28, 20–37.
- Baly, F.; Hajj, H. AraBERT: Transformer-based Model for Arabic Language Understanding. In Proceedings of the 4th Workshop on Open-Source Arabic Corpora and Processing Tools, with a Shared Task on Offensive Language Detection, Marseille, France, 28 February 2020.
- ElJundi, O.; Antoun, W.; El Droubi, N.; Hajj, H.; El-Hajj, W.; Shaban, K. hULMonA: The Universal Language Model in Arabic. In Proceedings of the Fourth Arabic Natural Language Processing Workshop, Florence, Italy, 1 August 2019.
- Liu, Y.; Ott, M.; Goyal, N.; Du, J.; Joshi, M.; Chen, D.; Levy, O.; Lewis, M.; Zettlemoyer, L.; Stoyanov, V. Roberta: A robustly optimized bert pretraining approach. arXiv 2019, arXiv:1907.11692.
- Karahoca, A.; Karahoca, D.; Aydin, N. GSM Churn Management Using an Adaptive Neuro-Fuzzy Inference System. In Proceedings of the 2007 International Conference on Intelligent Pervasive Computing (IPC 2007), Institute of Electrical and Electronics Engineers (IEEE), Jeju, Korea, 11–13 October 2007; pp. 323–326.
- Almuqren, L.; Cristea, A. Bi-GRU Arabic Sentiment Analysis Based on AraBERT; Durham University: Durham, UK, 2021.
Techniques | References |
---|---|
Decision tree | [6,25,27,35,36,41,42,54,55,56,57,58,59,60,61,62] |
Logistic regression | [1,6,30,34,42,54,55,57,58,59,60,62,63,64] |
Neural network | [28,32,34,35,36,54,57,58,59,60,62,65,66,67] |
Support vector machine | [2,35,57,65,68,69] |
J48 | [2,32] |
CART | [2,32] |
Naïve Bayes | [46,57,58,60,65] |
Fuzzy classification | [32] |
Rule-based classification | [31] |
k-means algorithm | [35,59] |
Random forest | [41,70] |
Company | Twitter Handle | # of Unique Tweets |
---|---|---|
STC | @STC_KSA, @STCcare | 7590 |
Mobily | @Mobily, @Mobily1100 | 6460 |
Zain | @ZainKSA, @ZainHelpSA | 5950 |
Total | | 20,000 |
Tweet in Arabic | Label | Company | Tweet in English |
---|---|---|---|
@STCcare غيري الشركه | Negative | STC | Change the Company @STCcare |
@GOclub @Mobilyاشكر 😊 | Positive | Mobily | Thank you @GOclub @Mobily 😊 |
Tweet in Arabic | Label | Company | Tweet in English |
---|---|---|---|
غيري شركه | Negative | STC | Change the Company |
اشكرك | Positive | Mobily | Thank you |
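The two tables above show example tweets with and without account mentions and emoji. A minimal Python sketch of that kind of cleaning step is given below; it is an assumption about one plausible approach, not the authors' pipeline, and the name `clean_tweet` and both regular expressions are hypothetical. It only strips mentions and common emoji, so it does not reproduce any further normalisation of the Arabic text that the second table may reflect.

```python
import re

# Twitter handles use ASCII letters, digits and underscores, so the pattern is
# restricted to ASCII to avoid swallowing Arabic text glued to a mention.
MENTION_RE = re.compile(r"@[A-Za-z0-9_]+")

# A rough range covering common emoji and pictographic symbols (not exhaustive).
EMOJI_RE = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")


def clean_tweet(text: str) -> str:
    """Remove @mentions and common emoji, then collapse repeated whitespace."""
    text = MENTION_RE.sub(" ", text)
    text = EMOJI_RE.sub(" ", text)
    return " ".join(text.split())


if __name__ == "__main__":
    print(clean_tweet("@STCcare غيري الشركه"))
    print(clean_tweet("@GOclub @Mobilyاشكر 😊"))
```

Restricting the mention pattern to ASCII is a deliberate choice: Python's `\w` also matches Arabic letters, which would delete the word attached to `@Mobily` in the second example.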
Company | Predicted Customer Satisfaction | Actual Customer Satisfaction |
---|---|---|
STC | 31.06% | 20.1% |
Mobily | 34.25% | 22.89% |
Zain | 32.06% | 22.91% |
Class | Precision | Recall | F1-Score |
---|---|---|---|
Non-churner | 1.00 | 0.94 | 0.97 |
Churner | 0.87 | 1.00 | 0.93 |
Macro average | 0.93 | 0.97 | 0.95 |
Weighted average | 0.96 | 0.96 | 0.96 |
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Almuqren, L.; Alrayes, F.S.; Cristea, A.I. An Empirical Study on Customer Churn Behaviours Prediction Using Arabic Twitter Mining Approach. Future Internet 2021, 13, 175. https://doi.org/10.3390/fi13070175