Intelligent Algorithms for Detecting Attacks in the Web Environment
https://doi.org/10.15514/ISPRAS-2024-36(4)-8
Abstract
This article analyzes the use of machine learning algorithms to detect attacks that exploit the user web environment or the functionality of user applications. Supervised learning and clustering algorithms are considered. The dataset is a sample of 39,221 online shopping transactions collected by an e-commerce retailer. After a review and comparative analysis, the best-performing implementations of the machine learning algorithms were selected for detecting attacks in the web environment, and the most effective algorithm for detecting fraudulent transactions was determined, using accuracy and running time as the criteria. The accuracy of detecting fraudulent transactions is 100% for the Random Forest, GB (Scikit-learn), and GB (CatBoost) algorithms and 99.9% for the KD-trees algorithm. Gradient boosting (GB) in the CatBoost implementation is 4.2 times faster than Random Forest, 2.4 times faster than GB in Scikit-learn, 1.2 times faster than GB without the cat_features parameter, 41.9 times faster than k-dimensional trees, and 66.8 times faster than DBSCAN. The results for each method are presented in tables. The algorithms are evaluated by training time, together with the confusion matrix and classification report for the classification algorithms, and fowlkes_mallows_score, rand_score, adjusted_rand_score, homogeneity, completeness, and V-measure for the clustering algorithms.
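To make the evaluation criteria named in the abstract concrete, the sketch below computes a binary confusion matrix and two of the pair-counting clustering scores (the Rand index and the Fowlkes-Mallows score) in plain Python. It is an illustration only and not the authors' code: the function names and the toy label lists are assumptions introduced here, and a real experiment would more likely call the corresponding scikit-learn metrics on the transaction dataset.

```python
from itertools import combinations
from math import sqrt


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary fraud / not-fraud labelling."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif t == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def pair_counts(labels_true, labels_pred):
    """Count sample pairs by whether they share a class and/or a cluster."""
    tp = fp = fn = tn = 0
    for i, j in combinations(range(len(labels_true)), 2):
        same_class = labels_true[i] == labels_true[j]
        same_cluster = labels_pred[i] == labels_pred[j]
        if same_class and same_cluster:
            tp += 1
        elif same_cluster:
            fp += 1
        elif same_class:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def rand_index(labels_true, labels_pred):
    """Fraction of pairs on which the clustering agrees with the classes."""
    tp, fp, fn, tn = pair_counts(labels_true, labels_pred)
    return (tp + tn) / (tp + fp + fn + tn)


def fowlkes_mallows(labels_true, labels_pred):
    """Geometric mean of pairwise precision and pairwise recall."""
    tp, fp, fn, _ = pair_counts(labels_true, labels_pred)
    if tp == 0:
        return 0.0
    return tp / sqrt((tp + fp) * (tp + fn))


# Hypothetical toy labels (1 = fraudulent transaction), not data from the paper.
y_true = [0, 0, 0, 1, 1, 0, 1, 0]
y_pred = [0, 0, 0, 1, 0, 0, 1, 1]
tp, fp, fn, tn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)

# Hypothetical class labels and cluster assignments for six samples.
classes = [0, 0, 0, 1, 1, 1]
clusters = [0, 0, 1, 1, 2, 2]
print(accuracy, rand_index(classes, clusters), fowlkes_mallows(classes, clusters))
```

Both clustering scores depend only on which samples are grouped together, so renaming the cluster labels does not change them. This matters for algorithms such as DBSCAN, whose cluster numbers carry no meaning.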
About the Authors
Maria Anatolyevna LAPINA
Russian Federation
Cand. Sci. (Phys.-Math.), Associate Professor of the Department of Information Security of Automated Systems of the North Caucasus Federal University. Research interests: digital technologies, information security management, process approach, educational process, cryptography.
Vitaliya Valentinovna MOVZALEVSKAYA
Russian Federation
Student of the Department of Information Security of Automated Systems of the North Caucasus Federal University. Research interests: programming, machine learning.
Marina Evgenievna TOKMAKOVA
Russian Federation
Student of the Department of Information Security of Automated Systems of the North Caucasus Federal University. Research interests: cryptography, machine learning.
Mikhail Grigorievich BABENKO
Russian Federation
Dr. Sci. (Phys.-Math.), Head of the Department of Computational Mathematics and Cybernetics of the North Caucasus Federal University. Research interests: algebraic structures in Galois fields, modular arithmetic, neurocomputer technologies, digital signal processing, cryptographic methods of information protection.
Viktor Pavlovich KOCHIN
Belarus
Cand. Sci. (Tech.), Vice-Rector for Academic Affairs and Internationalization of Education of the Belarusian State University. Research interests: integrated information security systems, information and communication technologies.
References
1. T. Hastie, R. Tibshirani, J. Friedman, “The Elements of Statistical Learning”, p. 745, August 2009.
2. C. Chio, D. Freeman, “Machine Learning and Security”, p. 386, February 2017.
3. Power, R. “Tangled Web: Tales of Digital Crime from the Shadows of Cyberspace”, pp. 396-397, 2000.
4. Uyazvimosti i ugrozy veb-prilozhenij v 2020-2021 gg. Official website – URL: https://www.ptsecurity.com/ru-ru/research/analytics/web-vulnerabilities-2020-2021/#id5.
5. Andress, J., “The Basics of Information Security”, Second Edition, Chapter 3, Authorization and Access Control, p. 190, 2014.
6. R. Hamsa Veni, A. Hariprasad Reddy, C. Kesavulu, “Identifying Malicious Web Links and Their Attack Types in Social Networks”, International Journal of Scientific Research in Computer Science, Engineering and Information Technology, pp. 1060-1066, March-April 2018.
7. R. F. Fouladi, C. E. Kayatas, E. Anarim, “Frequency based DDoS attack detection approach using naive Bayes classification”, International Conference on Telecommunications and Signal Processing (TSP), June 2016.
8. Todd A. Stephenson, “An Introduction to Bayesian Network Theory and Usage”, p. 29, 2000.
9. D. Atienza, A. Herrero, E. Corchado, “Neural analysis of HTTP traffic for web attack detection”, Computational Intelligence in Security for Information Systems Conference, pp. 201-212, January 2015.
10. B. Goyal, M. Bansal, “Competent Approach for Type of Phishing Attack Detection Using Multi-Layer Neural Network”, International Journal of Advanced Engineering Research and Science, pp. 210-215, January 2017.
11. Hervé Abdi, “A neural network primer”, Journal of Biological Systems, pp. 247-281, 1994.
12. Sivak M. A., Timofeev V. S., “Configuring robust neural networks to solve the classification problem”, Reports of Tomsk State University of Control Systems and Radioelectronics, pp. 26-32, 2021.
13. N. Florian Epp, R. Funk, C. R. Cappo, “Anomaly-based web application firewall using HTTP-specific features and one-class SVM”, September 2017.
14. Z. Tian, “Distributed Deep Learning System for Web Attack Detection on Edge Devices”, IEEE Transactions on Industrial Informatics, November 2019.
15. Ye Jin, “A DDoS attack detection method based on SVM in software defined network”, Security and Communication Networks, April 2018.
16. S. Suthaharan, “Support Vector Machine”, Machine Learning Models and Algorithms for Big Data Classification, pp. 207-235, January 2016.
17. Otchet ob atakah na onlajn-resursy rossijskih kompanij. official website [https://www.ptsecurity.com/ru-ru/] – URL: https://rt-solar.ru/upload/iblock/34a/5w4h9o57axovdbv3ng7givrz271ykir3/Ataki-na-onlayn_resursy-rossiyskikh-kompaniy-v-2022-godu.pdf?ysclid=lubdnvft2p622633541.
18. Aktual'nye kiberugrozy: itogi 2021 goda. official website [https://www.ptsecurity.com/ru-ru/] – URL: https://www.ptsecurity.com/ru-ru/research/analytics/cybersecurity-threatscape-2021/.
19. Garfinkel, S. and Spafford, E.H., “Web Security and Commerce”, O'Reilly and Associates, pp. 450-470, February 1997.
20. Martynov A., Kandybla V., “Metod gradientnogo spuska v mashinnom obuchenii”, Zhurnal «Shag v nauku», pp. 4-8, 2022.
21. CatBoost is a high-performance open-source library for gradient boosting on decision trees. official website [https://catboost.ai/?ysclid=lwdjifxlqs185272725] – URL: https://catboost.ai/en/docs/concepts/speed-up-training?ysclid=lwdk46v95o460729700.
22. Gradient Boosting. official website [https://scikit-learn.org/stable/] – URL: https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.GradientBoostingClassifier.html.
Review
For citations:
LAPINA M.A., MOVZALEVSKAYA V.V., TOKMAKOVA M.E., BABENKO M.G., KOCHIN V.P. Intelligent Algorithms for Detecting Attacks in the Web Environment. Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS). 2024;36(4):99-116. https://doi.org/10.15514/ISPRAS-2024-36(4)-8