Machine Learning Use Cases in Cybersecurity
https://doi.org/10.15514/ISPRAS-2019-31(5)-15
Abstract
Applying machine learning to cybersecurity is difficult: advances in the field open up so many possibilities that it is hard to identify genuinely good options for implementation and decision-making. Moreover, the same technologies can be used by attackers to carry out cyberattacks. This article surveys current machine-learning-based technologies in both cybersecurity and cyberattacks, and presents a model of a machine-learning-based attack.
About the authors
Sergey Mikhailovich Avdoshin
National Research University Higher School of Economics
Russia
Candidate of Technical Sciences, Professor, Head of the Department of Software Engineering, Faculty of Computer Science
Alexander Vyacheslavovich Lazarenko
Russia
Head of the Innovation and Product Development Department
Natalia Igorevna Chichileva
Russia
Junior Specialist, Innovation and Product Development Department
Pavel Andreevich Naumov
Russia
Junior Specialist, Innovation and Product Development Department
Petr Georgievich Klyucharev
Russia
Candidate of Technical Sciences, Associate Professor, Department of Information Security
For citation (in Russian):
Авдошин С.М., Лазаренко А.В., Чичилева Н.И., Наумов П.А., Ключарев П.Г. Примеры использования машинного обучения в кибербезопасности. Труды Института системного программирования РАН. 2019;31(5):191-202. https://doi.org/10.15514/ISPRAS-2019-31(5)-15
For citation:
Avdoshin S., Lazarenko A., Chichileva N., Naumov P., Klyucharev P. Machine Learning Use Cases in Cybersecurity. Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS). 2019;31(5):191-202. (In Russ.) https://doi.org/10.15514/ISPRAS-2019-31(5)-15