Enhancing the Optimization of BI-LSTM Classifier with Ensemble Methods, Regularization, and Cross-Validation Techniques for Email Spam Detection

Authors

  • Arepalli Gopi, Research Scholar, Department of Computer Science & Engineering, Annamalai University, Chidambaram, Tamil Nadu, India
  • Sudha L.R, Associate Professor, Department of Computer Science & Engineering, Annamalai University, Chidambaram
  • Iwin Thanakumar Joseph S, Associate Professor, Department of Computer Science & Engineering, Koneru Lakshmaiah Education Foundation, Vaddeswaram, India

DOI:

https://doi.org/10.62486/agmu202544

Keywords:

Machine Learning, Bi-LSTM, LSTM, Sigmoid, Optimization, Regularization, K-Fold Cross-Validation, Ensemble

Abstract

Email spam remains a persistent and growing problem that disrupts digital communication and costs users worldwide time and convenience. As technology advances, spammers keep adapting and refining their tactics to reach email inboxes, so staying current with state-of-the-art anti-spam techniques is essential for securing email and filtering out unwanted messages. This work explores how to improve email spam detection by augmenting a Bidirectional Long Short-Term Memory (BI-LSTM) classifier. Our approach integrates ensemble methods, regularization techniques, and cross-validation into the BI-LSTM model to build a more robust spam detection system. The paper describes the technical details of these methods and explains how they work together to strengthen the classifier's performance.
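
As a rough illustration of how cross-validation and ensembling can wrap a classifier, the following Python sketch trains one model per fold and combines the members by averaging their sigmoid-style spam probabilities (soft voting). It is not the authors' implementation. The names `k_fold_indices`, `soft_vote`, `cross_validated_ensemble`, and `toy_train_fn` are hypothetical, and `toy_train_fn` is only a keyword-based stand-in for fitting a BI-LSTM. In a real pipeline, regularization such as dropout or an L2 penalty would sit inside the training function and is not shown here.

```python
import random
from statistics import mean
from typing import Callable, List, Sequence, Tuple

# A "model" is any callable that maps a text to a spam probability in [0, 1].
Model = Callable[[str], float]


def k_fold_indices(n: int, k: int, seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Shuffle indices 0..n-1 and return k (train, validation) index splits."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k roughly equal, disjoint folds
    splits = []
    for i in range(k):
        val = folds[i]
        train = [j for f in range(k) if f != i for j in folds[f]]
        splits.append((train, val))
    return splits


def soft_vote(models: Sequence[Model], text: str, threshold: float = 0.5) -> int:
    """Average the members' probabilities; return 1 for spam, 0 for ham."""
    return int(mean(m(text) for m in models) >= threshold)


def cross_validated_ensemble(
    texts: Sequence[str],
    labels: Sequence[int],
    train_fn: Callable[[List[str], List[int]], Model],
    k: int = 5,
) -> Tuple[List[Model], List[float]]:
    """Train one model per fold; return the models and per-fold validation accuracy."""
    models, scores = [], []
    for train_idx, val_idx in k_fold_indices(len(texts), k):
        model = train_fn([texts[i] for i in train_idx], [labels[i] for i in train_idx])
        correct = sum(int(model(texts[i]) >= 0.5) == labels[i] for i in val_idx)
        scores.append(correct / len(val_idx))
        models.append(model)
    return models, scores


def toy_train_fn(train_texts: List[str], train_labels: List[int]) -> Model:
    """Hypothetical stand-in for a regularized BI-LSTM: scores a text by the
    fraction of its words that appeared only in spam training examples."""
    spam_words = {w for t, y in zip(train_texts, train_labels) if y == 1 for w in t.lower().split()}
    ham_words = {w for t, y in zip(train_texts, train_labels) if y == 0 for w in t.lower().split()}
    markers = spam_words - ham_words

    def model(text: str) -> float:
        words = text.lower().split()
        return sum(w in markers for w in words) / len(words) if words else 0.0

    return model
```

Calling `cross_validated_ensemble(texts, labels, toy_train_fn, k=5)` returns the five fold models together with their validation accuracies, and `soft_vote(models, text)` then classifies a new message with the whole ensemble. Swapping `toy_train_fn` for a function that fits an actual BI-LSTM would leave the surrounding scaffold unchanged.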

Published

2025-01-01

How to Cite

1.
Gopi A, Sudha L, Iwin Thanakumar JS. Enhancing the Optimization of BI-LSTM Classifier with Ensemble Methods, Regularization, and Cross-Validation Techniques for Email Spam Detection. Multidisciplinar (Montevideo) [Internet]. 2025 Jan. 1 [cited 2025 Feb. 14];3:44. Available from: https://multidisciplinar.ageditor.uy/index.php/multidisciplinar/article/view/44