Email Campaign Evaluation Based on User and Mail Server Response
Abstract
1. Introduction
- Based on an analysis of data collected by FreshMail, we proposed feature sets that are sufficient to analyse the technical aspects of sending campaign and transactional emails;
- We developed methods for labelling large data sets automatically;
- We defined artificial neural network models and showed, using real data collected by the company, that they are highly effective (the F1-score was above 0.95 for every sample used).
2. Data Processing and Analysis System
3. Data Sets
3.1. Campaigns
- Eight statistics for the current minute (as shown in Table 1);
- Eight statistics for the previous minute (for the first minute, a copy of the current-minute values was used);
- Eight statistics for historical campaigns of the same client, e.g., the average number of soft bounces per minute for previous campaigns.
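The three groups of statistics listed above can be combined into one input row per minute. The following is a minimal sketch only: the function name `build_feature_rows`, the ordering of the three groups within a row, and the statistic names in `STATS` are assumptions made for illustration, not details confirmed by the source.

```python
from typing import List, Sequence

# Names of the eight per-minute statistics (hypothetical identifiers).
STATS = (
    "opens", "unique_opens", "clicks", "unique_clicks",
    "soft_bounced", "hard_bounced", "resigned", "complaint",
)


def build_feature_rows(
    per_minute: Sequence[Sequence[float]],
    historical: Sequence[float],
) -> List[List[float]]:
    """Return one 24-value row per minute: current + previous + historical.

    For the first minute there is no previous minute, so the current
    values are copied, as described in the list above.
    """
    if len(historical) != len(STATS):
        raise ValueError("historical must contain eight statistics")
    rows: List[List[float]] = []
    for i, current in enumerate(per_minute):
        if len(current) != len(STATS):
            raise ValueError("each minute must contain eight statistics")
        previous = per_minute[i - 1] if i > 0 else current
        rows.append(list(current) + list(previous) + list(historical))
    return rows


# Usage with made-up numbers: two minutes of a campaign.
minute0 = [10, 8, 3, 2, 1, 0, 0, 0]
minute1 = [25, 20, 6, 5, 0, 1, 0, 0]
history = [18.0, 14.0, 4.0, 3.5, 0.5, 0.4, 0.1, 0.0]
rows = build_feature_rows([minute0, minute1], history)
```

Each row has 24 values; the first row repeats the current-minute statistics in the "previous minute" slot.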
- label1: The campaign should be stopped (label bad) if the proportion of bad events exceeds 5%.
- label2: The campaign should be stopped if the hard bounced count exceeds 10% or the resigned count exceeds 1.7% and the unique clicks count does not exceed 1.5%.
- label3: The campaign should be stopped if the standardised value (z-score) of bad events is greater than 2.
- label4: The campaign should be stopped if the unique opens count does not exceed 2%.
- label5: A method based on examples of bad clients identified by FreshMail.
- If label3 = 1, then label6 = 1;
- If label3 = 0, and label1 = label2 = label4 = label5 = 1, then label6 = 1;
- Otherwise, label6 = 0.
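The labelling rules above can be sketched in a few lines of Python. Several points are assumptions, because the surviving text does not settle them: "bad events" is taken here to be the sum of soft bounces, hard bounces, resignations and complaints; the z-score in label3 is computed against a reference population of bad-event rates; the condition in label2 is read as "(hard bounced > 10% or resigned > 1.7%) and unique clicks <= 1.5%"; and label5 is treated as an externally supplied value. All names (`CampaignRates`, `bad_rate`, and so on) are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import Sequence


@dataclass
class CampaignRates:
    """Rates as fractions of sent messages (0.05 means 5%)."""
    unique_opens: float
    unique_clicks: float
    soft_bounced: float
    hard_bounced: float
    resigned: float
    complaint: float


def bad_rate(c: CampaignRates) -> float:
    # Assumption: "bad events" = bounces + resignations + complaints.
    return c.soft_bounced + c.hard_bounced + c.resigned + c.complaint


def label1(c: CampaignRates) -> int:
    return int(bad_rate(c) > 0.05)


def label2(c: CampaignRates) -> int:
    # Assumed reading of the rule: (A or B) and C.
    return int((c.hard_bounced > 0.10 or c.resigned > 0.017)
               and c.unique_clicks <= 0.015)


def label3(value: float, population: Sequence[float]) -> int:
    # z-score of one bad-event rate against a reference population.
    sigma = pstdev(population)
    if sigma == 0:
        return 0
    return int((value - mean(population)) / sigma > 2)


def label4(c: CampaignRates) -> int:
    return int(c.unique_opens <= 0.02)


def label6(l1: int, l2: int, l3: int, l4: int, l5: int) -> int:
    # label3 alone is decisive; otherwise all other labels must agree.
    if l3 == 1:
        return 1
    return int(l1 == l2 == l4 == l5 == 1)


# Usage with made-up rates for a campaign with many hard bounces.
c = CampaignRates(unique_opens=0.0102, unique_clicks=0.0025,
                  soft_bounced=0.0047, hard_bounced=0.2071,
                  resigned=0.0004, complaint=0.0)
population = [0.01] * 9 + [0.21]
```

Under these assumptions, a campaign is flagged by label6 either because its bad-event rate is an outlier (label3) or because every other criterion, including the externally supplied label5, marks it as bad.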
3.2. Transaction Mails
4. Artificial Neural Networks Models
4.1. Campaigns Analysis
4.2. Transaction Emails Analysis
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Attribute (Metric) | Description |
---|---|
opens count | number of opens |
unique opens count | number of unique opens |
clicks count | number of clicks (at least one link) |
unique clicks count | number of unique clicks |
soft bounced count | number of soft bounces |
hard bounced count | number of hard bounces |
resigned count | number of resignations |
complaint count | number of complaints |
No | Opens | Unique Opens | Clicks | Unique Clicks | Soft Bounced | Hard Bounced | Resigned | Complaint |
---|---|---|---|---|---|---|---|---|
0 | 32.98% | 23.36% | 9.25% | 7.12% | 0.23% | 2.43% | 0.16% | 0.00% |
1 | 5.89% | 4.81% | 1.76% | 1.43% | 7.48% | 2.42% | 0.07% | 0.00% |
2 | 28.66% | 17.72% | 2.92% | 2.38% | 0.44% | 2.02% | 2.15% | 0.00% |
3 | 8.51% | 7.25% | 2.53% | 2.13% | 0.22% | 1.53% | 0.05% | 0.00% |
4 | 1.10% | 0.95% | 0.13% | 0.11% | 0.12% | 0.75% | 0.01% | 0.00% |
5 | 1.62% | 1.37% | 0.20% | 0.17% | 0.09% | 0.72% | 0.01% | 0.00% |
6 | 128.65% | 48.46% | 43.69% | 24.45% | 0.29% | 2.33% | 0.62% | 0.02% |
7 | 1.23% | 1.02% | 0.30% | 0.25% | 0.47% | 20.71% | 0.04% | 0.00% |
Class | Precision | Recall | F1-Score | Support |
---|---|---|---|---|
0 | 0.98 | 0.98 | 0.98 | 375,025 |
1 | 0.93 | 0.93 | 0.93 | 124,975 |
accuracy | | | 0.97 | 500,000 |
macro avg | 0.95 | 0.96 | 0.95 | 500,000 |
weighted avg | 0.97 | 0.97 | 0.97 | 500,000 |
Class | Precision | Recall | F1-Score | Support |
---|---|---|---|---|
0 | 0.98 | 0.98 | 0.98 | 250,123 |
1 | 0.98 | 0.98 | 0.98 | 249,877 |
accuracy | | | 0.98 | 500,000 |
macro avg | 0.98 | 0.98 | 0.98 | 500,000 |
weighted avg | 0.98 | 0.98 | 0.98 | 500,000 |
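The "macro avg" and "weighted avg" rows in the two classification reports above follow from the per-class rows and their supports. The sketch below recomputes them from the rounded per-class F1-scores; the helper name `macro_and_weighted` is hypothetical. Because the inputs are already rounded to two decimals, the recomputed values can differ from the reported ones in the last digit.

```python
from typing import Sequence, Tuple


def macro_and_weighted(
    scores: Sequence[float], supports: Sequence[int]
) -> Tuple[float, float]:
    """Return (macro average, support-weighted average) of per-class scores."""
    if not scores or len(scores) != len(supports):
        raise ValueError("scores and supports must be non-empty and aligned")
    total = sum(supports)
    macro = sum(scores) / len(scores)
    weighted = sum(s * n for s, n in zip(scores, supports)) / total
    return macro, weighted


# Per-class F1-scores and supports from the first report (imbalanced classes).
macro_f1_a, weighted_f1_a = macro_and_weighted([0.98, 0.93], [375_025, 124_975])

# Per-class F1-scores and supports from the second report (balanced classes).
macro_f1_b, weighted_f1_b = macro_and_weighted([0.98, 0.98], [250_123, 249_877])
```

For the first report this gives a macro F1 of 0.955 and a weighted F1 of about 0.9675, consistent with the reported 0.95 and 0.97 once rounding of the per-class values is taken into account. The weighted average is higher than the macro average because the majority class 0 has the better score.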
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Szpyrka, M.; Suszalski, P.; Obara, S.; Nalepa, G.J. Email Campaign Evaluation Based on User and Mail Server Response. Appl. Sci. 2023, 13, 1630. https://doi.org/10.3390/app13031630