Research of Machine Learning Methods for Detecting Network Attacks
https://doi.org/10.15514/ISPRAS-2025-37(4)-24
Abstract
Detecting network attacks is becoming increasingly important as cyber threats grow more complex and the limitations of traditional signature-based methods become more apparent. This paper presents a comprehensive analysis of five machine learning algorithms, with a focus on model interpretability and on handling imbalanced simulated network traffic data. The main objective is to increase the accuracy of detecting cyber-attacks, including DDoS and port scanning, using a decision tree, logistic regression, random forest and other methods. The study was performed in Python 3.13 using the scikit-learn, XGBoost and TensorFlow libraries. The choice of tools was determined by the specifics of the task: for classical methods (trees, logistic regression) and ensemble approaches (Random Forest, XGBoost), scikit-learn proved optimal, while for neural network experiments (RProp MLP) TensorFlow/Keras provided a convenient interface for prototyping. PyTorch was not used because it offered no advantages for binary classification on structured data, although its use could be justified for analyzing sequences or unstructured logs in future research. The decision tree demonstrated high accuracy: 99.4% with a depth of 5 and 8 key features selected out of 18. After tuning, gradient boosting showed a comparable result of 99.58%, but its training took significantly longer (576 seconds versus 69 for the decision tree). The random forest achieved 97.98% accuracy, and logistic regression achieved 96.53%. Naive Bayes proved the least effective (86.48%), despite attempts to improve it with PCA. Linear regression converted into a classifier showed an accuracy of 94.94%, which is lower than the ensemble methods but acceptable for a baseline approach. The practical value of the work is confirmed by testing on real network data.
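The decision tree configuration mentioned above (depth 5, 8 features selected out of 18) can be illustrated with a minimal scikit-learn sketch. This is not the authors' code: the synthetic data from `make_classification`, the `f_classif` scoring function used for feature selection, the 10% attack share, and the `class_weight="balanced"` setting are all assumptions chosen only to make the example self-contained.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.metrics import accuracy_score, balanced_accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for traffic records (NOT the paper's dataset):
# 18 features, roughly 10% of samples in the "attack" class.
X, y = make_classification(
    n_samples=5000, n_features=18, n_informative=8, n_redundant=4,
    weights=[0.9, 0.1], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Keep the 8 highest-scoring features, then fit a shallow tree.
# The selection criterion (ANOVA F-score) is an assumption.
model = Pipeline([
    ("select", SelectKBest(score_func=f_classif, k=8)),
    ("tree", DecisionTreeClassifier(
        max_depth=5, class_weight="balanced", random_state=0)),
])
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# On imbalanced data, balanced accuracy is worth reporting next to accuracy.
acc = accuracy_score(y_test, y_pred)
bal_acc = balanced_accuracy_score(y_test, y_pred)
print(f"accuracy={acc:.4f}, balanced accuracy={bal_acc:.4f}")
```

Placing the selector inside the pipeline means the 8 features are chosen from the training split only, which avoids leaking test information into feature selection.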
The results can form the basis of hybrid systems that combine several algorithms to increase detection reliability. For example, using a fast decision tree for primary analysis and gradient boosting to refine complex cases would balance speed against accuracy. The interpretability of models is also important: trees and logistic regression not only showed good results but also made it possible to identify the key indicators of attacks, which is critical for integration into existing security systems.
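One possible reading of the hybrid idea is a confidence-based cascade, sketched below under stated assumptions. The function `cascade_predict`, the 0.9 confidence threshold, and the synthetic data are hypothetical, and scikit-learn's `GradientBoostingClassifier` stands in for whichever boosting implementation a real system would use.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def cascade_predict(fast_model, slow_model, X, threshold=0.9):
    """Use fast_model where it is confident; defer the rest to slow_model.

    Returns the predictions and a boolean mask of deferred samples.
    """
    proba = fast_model.predict_proba(X)
    deferred = proba.max(axis=1) < threshold
    pred = fast_model.classes_[proba.argmax(axis=1)]
    if deferred.any():
        pred[deferred] = slow_model.predict(X[deferred])
    return pred, deferred


# Synthetic, imbalanced stand-in data (NOT the paper's dataset).
X, y = make_classification(
    n_samples=3000, n_features=18, n_informative=8,
    weights=[0.9, 0.1], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

fast = DecisionTreeClassifier(max_depth=5, random_state=0).fit(X_train, y_train)
slow = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)

pred, deferred = cascade_predict(fast, slow, X_test, threshold=0.9)
print(f"deferred to boosting: {deferred.mean():.1%}")
print(f"cascade accuracy: {(pred == y_test).mean():.4f}")
```

The threshold controls the trade-off: a higher value sends more samples to the slower model, a lower value keeps more decisions in the fast tree.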
Keywords
About the Authors
Maria Anatolyevna LAPINA
Russian Federation
Cand. Sci. (Phys.-Math.), Associate Professor at the Department of Computational Mathematics and Cybernetics at the North Caucasus Federal University. Research interests: digital technologies, information security management, process approach, and cryptography.
Nazar Vladimirovich PODRUCHNY
Russian Federation
Student of the North Caucasus Federal University. Research interests: cryptography, machine learning, digital technologies, information security management, process approach, and educational process.
Mikhail Andreevich RUSANOV
Russian Federation
Postgraduate student at the Institute of Information Technologies at the Moscow University of Finance and Law. Research interests: complex information protection systems, information and communication technologies.
Mikhail Grigoryevich BABENKO
Russian Federation
Dr. Sci. (Phys.-Math.), Head of the Department of Computational Mathematics and Cybernetics at the North Caucasus Federal University. Research interests: algebraic structures in Galois fields, modular arithmetic, neurocomputer technologies, digital signal processing, and cryptographic methods of information protection.
For citations:
LAPINA M.A., PODRUCHNY N.V., RUSANOV M.A., BABENKO M.G. Research of Machine Learning Methods for Detecting Network Attacks. Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS). 2025;37(4):147-174. https://doi.org/10.15514/ISPRAS-2025-37(4)-24