AN INTERACTIVE SPAM DETECTION MODEL USING AN ENSEMBLE ALGORITHM
DOI:
https://doi.org/10.46754/jmsi.2026.06.007
Keywords:
Spam messages, SMS, preprocessing, TF-IDF, feature extraction
Abstract
Short Message Service (SMS) spam remains a significant issue in mobile communication systems. This study presents a framework for identifying spam in SMS communications using an ensemble of machine learning classifiers. The framework first preprocesses the raw SMS dataset, then extracts features with the Term Frequency–Inverse Document Frequency (TF-IDF) technique, which represents the text numerically so that the model can capture relevant characteristics of the processed data. In the experiment, the ensemble model achieved 97% accuracy, outperforming Support Vector Machine (SVM), K-Nearest Neighbour (KNN), and Random Forest (RF), which achieved accuracies of 70.93%, 73.21%, and 79.18%, respectively. The proposed approach enhances user security in mobile communication and offers an effective detection system for the persistent problem of SMS spam.
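The TF-IDF step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which TF-IDF variant, tokenizer, or library the study used, so everything below is an assumption made for illustration: the helper names `tokenize` and `tfidf` are hypothetical, term frequency is taken as the count divided by message length, inverse document frequency as ln(N / df), and the three messages are invented examples rather than data from the study.

```python
import math
import re
from collections import Counter


def tokenize(message):
    """Lowercase a message and split it into alphanumeric tokens (assumed preprocessing)."""
    return re.findall(r"[a-z0-9]+", message.lower())


def tfidf(messages):
    """Return one {term: weight} dict per message.

    tf  = term count / number of tokens in the message
    idf = ln(N / df), where df is the number of messages containing the term
    (an unsmoothed textbook variant, chosen here only for illustration)
    """
    docs = [tokenize(m) for m in messages]
    n_docs = len(docs)
    # Document frequency: count each term once per message.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1  # guard against empty messages
        vectors.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return vectors


# Invented toy messages, not taken from the study's dataset.
messages = [
    "WIN a free prize now",
    "Are we meeting for lunch now",
    "Free entry to win cash",
]
vectors = tfidf(messages)
```

In this sketch a term that occurs in only one message, such as "prize", receives a higher weight than a term shared across messages, such as "now", which is the property that lets a downstream classifier focus on distinctive words.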
License
Copyright (c) 2026 Journal of Mathematical Sciences and Informatics

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

