Machine Learning Use Cases in Cybersecurity
https://doi.org/10.15514/ISPRAS-2019-31(5)-15
Abstract
The use of machine learning in cybersecurity is a difficult problem: advances in the field offer so many opportunities that it is challenging to identify the exceptional and beneficial use cases for implementation and decision making. Moreover, such technologies can be used by intruders to attack computer systems. The goal of this paper is to explore the use of machine learning in cybersecurity and in cyberattacks and to provide a model of a machine learning-powered attack.
About the Authors
Sergey Avdoshin
National Research University Higher School of Economics
Russian Federation
Candidate of Technical Science, Professor, Head of the School of Software Engineering
Aleksandr Lazarenko
Russian Federation
Head of R&D department
Nataliya Chichileva
Russian Federation
Junior specialist of R&D department
Pavel Naumov
Russian Federation
Junior specialist of R&D department
Peter Klyucharev
Russian Federation
Candidate of Technical Science, Associate Professor of the Department of Information Security
References
1. P. Krensky, J. Hare, “Hype Cycle for Data Science and Machine Learning, 2018”, Gartner, 2018. Accessed: Sep. 10, 2019. [Online] Available: https://www.gartner.com/en/documents/3883664/hype-cycle-for-data-science-and-machine-learning-2018
2. Nils J. Nilsson, “Artificial Intelligence: A New Synthesis”, Elsevier Inc, 1998.
3. “Businesses recognize the need for AI & ML tools in cybersecurity”, Helpnetsecurity.com. Accessed: Sep. 10, 2019. [Online]. Available: https://www.helpnetsecurity.com/2019/03/14/ai-ml-tools-cybersecurity/
5. T. Mitchell, “Machine Learning”, Springer Science & Business Media, 1986.
6. J. Grus, “Data Science from Scratch: First Principles with Python”, O'Reilly Media, 2015.
7. L. Deng, D. Yu, "Deep Learning: Methods and Applications", Foundations and Trends in Signal Processing, vol. 7, nos. 3–4, pp. 199-200.
8. K. Warr, “Strengthening Deep Neural Networks: Making AI Less Susceptible to Adversarial Trickery”, O'Reilly Media, Inc., Jul. 3, 2019, pp. 1-3.
9. T. Salimans, I. Goodfellow, W. Zaremba, V. Cheung, A. Radford, X. Chen, "Improved Techniques for Training GANs", in ArXiv, 2016. Accessed: Nov. 5, 2019. [Online]. Available: https://arxiv.org/pdf/1606.03498.pdf
10. I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S.Ozair, A. Courville, Y. Bengio, "Generative Adversarial Networks", in ArXiv, 2014. Accessed: Nov. 5, 2019. [Online]. Available: https://arxiv.org/pdf/1406.2661.pdf
11. J. Han, J. Pei, M. Kamber, “Data Mining: Concepts and Techniques”, Elsevier, 2011, pp. 1 - 38.
12. P. Chapman, J. Clinton, R. Kerber, T. Khabaza, T. Reinartz, C. Shearer, R. Wirth, “CRISP-DM 1.0 step-by-step data mining guide”, SPSS, 2000.
13. S. Dilek, H. Çakır, M. Aydın, “Applications Of Artificial Intelligence Techniques To Combating Cyber Crimes: A Review”, International Journal of Artificial Intelligence & Applications (IJAIA), vol. 6, no. 1, Jan. 2015, pp. 21 - 39.
14. S. Revathi and A. Malathi, “A Detailed Analysis on NSL-KDD Dataset Using Various Machine Learning Techniques for Intrusion Detection,” International Journal of Engineering Research and Technology, vol. 2, iss. 12, Dec. 2013.
15. A. L. Buczak and E. Guven, “A Survey of Data Mining and Machine Learning Methods for Cyber Security Intrusion Detection,” IEEE Commun. Surv. Tutor., vol. 18, no. 2, 2016, pp. 1153–1176.
16. W. Melicher, B. Ur, S. Segreti, S. Komanduri, L. Bauer, N. Christin, L. Cranor, “Fast, Lean, and Accurate: Modeling Password Guessability Using Neural Networks”, in Proceedings of the 25th USENIX Security Symposium, Aug. 10-12, 2016. Accessed: Oct. 19, 2019. [Online]. Available: https://www.usenix.org/system/files/conference/usenixsecurity16/sec16_paper_melicher.pdf
17. A. Ciaramella, P. D’Arco, A. De Santis, C. Galdi, R. Tagliaferri, “Neural Network Techniques for Proactive Password Checking”, IEEE TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING, vol. 3, no. 4, 2006, pp. 327 - 339.
18. C. Brook, “What is User and Entity Behavior Analytics? A Definition of UEBA, Benefits, How It Works, and More”, digitalguardian.com, Dec. 5, 2018. Accessed: Oct. 10, 2019. [Online]. Available: https://digitalguardian.com/blog/what-user-and-entity-behavior-analytics-definition-ueba-benefits-how-it-works-and-more
19. A. Buczak, E. Guven, “A Survey of Data Mining and Machine Learning Methods for Cyber Security Intrusion Detection”, IEEE communications surveys & tutorials, vol. 18, no. 2, 2016, pp. 1153 - 1176.
20. E. Kaspersky, “Laziness, Cybersecurity, and Machine Learning”, eugene.kaspersky.com, Sep. 26, 2016. Accessed: Oct. 10, 2019. [Online]. Available: https://eugene.kaspersky.com/2016/09/26/laziness-cybersecurity-and-machine-learning/
22. J. Roberts, “Cyber-Hunting at Scale (CHASE)”, darpa.mil/. Accessed: Oct. 19, 2019. [Online]. Available: https://www.darpa.mil/program/cyber-hunting-at-scale
23. A. Hernandez-Suarez, G. Sanchez-Perez, K. Toscano-Medina, V. Martinez-Hernandez, H. Perez-Meana, J. Olivares-Mercado, V. Sanchez, “Social Sentiment Sensor in Twitter for Predicting Cyber-Attacks Using ℓ1 Regularization”, Sensors, vol. 18, no. 5, 2018, pp. 1380.
24. A. Caliskan, F. Yamaguchi, E. Dauber, R. Harang, K. Rieck, R. Greenstadt, A. Narayanan, “De-anonymizing Programmers via Code Stylometry”, in Proceedings of the 24th USENIX Security Symposium, Washington, DC, USA, 2015, pp. 255 - 270.
25. A. Caliskan, F. Yamaguchi, E. Dauber, R. Harang, K. Rieck, R. Greenstadt, A. Narayanan, “When Coding Style Survives Compilation: De-anonymizing Programmers from Executable Binaries”, in ArXiv, 2017. Accessed: Oct. 10, 2019. [Online]. Available: https://arxiv.org/pdf/1512.08546.pdf
26. S. Repalle, V. Kolluru, “Intrusion Detection System using AI and Machine Learning Algorithm”, International Research Journal of Engineering and Technology (IRJET), vol. 04, iss. 12, December, 2017, pp. 1709 - 1715.
27. J. Vacca, S. Ellis, “Firewalls: Jumpstart for Network and Systems Administrators”, Elsevier, 2004.
28. E. Ucar, E. Ozhan, “The Analysis of Firewall Policy Through Machine Learning and Data Mining”, Wireless Personal Communications, vol. 96, iss. 2, 2017, pp. 2891 - 2909.
29. S. Prandl, M. Lazarescu, D. Pham, “A Study of Web Application Firewall Solutions”, Information Systems Security, 2015, pp. 501 - 510.
30. “Introduction to Forcepoint DLP Machine Learning”, websense.com, 2018. Accessed: Oct. 10, 2019. [Online]. Available: https://www.websense.com/content/support/library/data/v84/machine_learning/machine_learning.pdf
31. “OWASP Top 10 - 2017 The Ten Most Critical Web Application Security Risks”, OWASP, 2017. Accessed: Nov. 5, 2019. [Online]. Available: https://www.owasp.org/images/7/72/OWASP_Top_10-2017_%28en%29.pdf.pdf
32. S. Calzavara, M. Conti, R. Focardi, A. Rabitti, G. Tolomei, “Mitch: A Machine Learning Approach to the Black-Box Detection of CSRF Vulnerabilities”, in 2019 IEEE European Symposium on Security and Privacy (EuroS&P), 2019.
33. G. Pellegrino, M. Johns, S. Koch, M. Backes, C. Rossow, “Deemon: Detecting CSRF with Dynamic Analysis and Property Graphs” in ArXiv, 2017. Accessed: Nov. 5, 2019. [Online]. Available: https://arxiv.org/pdf/1708.08786.pdf
34. Z. Mao, N. Li, I. Molloy, “Defeating Cross-Site Request Forgery Attacks with Browser-Enforced Authenticity Protection”, in Financial Cryptography and Data Security, 2009, pp. 238 - 255.
35. Philippe De Ryck, Lieven Desmet, Thomas Heyman, Frank Piessens, “CsFire: Transparent Client-Side Mitigation of Malicious Cross-Domain Requests”, in Engineering Secure Software and Systems, Second International Symposium, ESSoS 2010, Pisa, Italy, February 3-4, 2010, pp. 18 - 34.
36. Jacob Wilkin, “Mapping Social Media with Facial Recognition: A New Tool for Penetration Testers and Red Teamers”, Trustwave.com, Aug. 08, 2018. Accessed: Oct. 19, 2019. [Online]. Available: https://www.trustwave.com/en-us/resources/blogs/spiderlabs-blog/mapping-social-media-with-facial-recognition-a-new-tool-for-penetration-testers-and-red-teamers/
37. R. Zellers, A. Holtzman, H. Rashkin, Y. Bisk, A. Farhadi, F. Roesner, Y. Choi, “Defending Against Neural Fake News”, in ArXiv, 2019. Accessed: Oct. 19, 2019. [Online]. Available: https://arxiv.org/abs/1905.12616
38. J. Seymour, P. Tully, “Weaponizing data science for social engineering: Automated E2E spear phishing on Twitter”. Accessed: Oct. 19, 2019. [Online]. Available: https://www.blackhat.com/docs/us-16/materials/us-16-Seymour-Tully-Weaponizing-Data-Science-For-Social-Engineering-Automated-E2E-Spear-Phishing-On-Twitter-wp.pdf
39. S. Thompson, “Phight Phraud”, in Journal of Accountancy. Accessed: Nov. 6, 2019. [Online]. Available: https://www.journalofaccountancy.com/issues/2006/feb/phightphraud.html
40. M. Jakobsson, J. Ratkiewicz. "Designing ethical phishing experiments: a study of (ROT13) rOnl query features", Proceedings of the 15th international conference on World Wide Web, 2006.
41. E. Bursztein, B. Benko, D. Margolis, T. Pietraszek, A. Archer, A. Aquino, A. Pitsillidis, S. Savage, "Handcrafted fraud and extortion: Manual account hijacking in the wild", Proceedings of the 2014 Conference on Internet Measurement Conference, 2014.
42. W. Hu, Y. Tan, “Generating Adversarial Malware Examples for Black-Box Attacks Based on GAN”, in ArXiv, 2017. Accessed: Oct. 19, 2019. [Online]. Available: https://arxiv.org/abs/1702.05983
43. M. Kawai, K. Ota, M. Dong, "Improved MalGAN: Avoiding Malware Detector by Leaning Cleanware Features," 2019 International Conference on Artificial Intelligence in Information and Communication (ICAIIC), Okinawa, Japan, 2019, pp. 40 - 45.
44. B. Hitaj, P. Gasti, G. Ateniese, F. Perez-Cruz, “PassGAN: A Deep Learning Approach for Password Guessing”, in ArXiv, 2017. Accessed: Oct. 19, 2019. [Online]. Available: https://arxiv.org/pdf/1709.00440.pdf
45. A. Narayanan, V. Shmatikov, “Fast dictionary attacks on passwords using time-space tradeoff”, in Proceedings of the 12th ACM Conference on Computer and Communications Security, CCS 2005, Alexandria, VA, USA, November 7-11, 2005, pp. 364 - 372.
46. I. Gulrajani, F. Ahmed, M. Arjovsky, V. Dumoulin, A. Courville, “Improved training of Wasserstein GANs”, in NIPS'17 Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, California, USA, December 04 - 09, 2017, pp. 5769 - 5779.
47. D. Kingma, J. Ba, “Adam: A Method For Stochastic Optimization”, in ArXiv, 2017. Accessed: Nov. 6, 2019. [Online]. Available: https://arxiv.org/pdf/1412.6980v9.pdf
48. K. He, X. Zhang, S. Ren, J. Sun, “Deep Residual Learning for Image Recognition”, in ArXiv, 2015. Accessed: Nov. 10, 2019. [Online]. Available: https://arxiv.org/abs/1512.03385
49. Hashcat - advanced password recovery. Accessed: Oct. 19, 2019. [Online]. Available: https://hashcat.net/hashcat/
50. John the Ripper password cracker. Accessed: Oct. 19, 2019. [Online]. Available: https://www.openwall.com/john/
51. M. Weir, S. Aggarwal, B. Medeiros, B. Glodek, “Password cracking using probabilistic context-free grammars”, in 30th IEEE Symposium on Security and Privacy. IEEE, 2009, pp. 391–405.
52. M. Dürmuth, F. Angelstorf, C. Castelluccia, D. Perito, C. Abdelber, “OMEN: Faster Password Guessing Using an Ordered Markov Enumerator”, in ESSoS, Springer, 2015, pp. 119–132.
53. “hashcat/rules/best64.rule”, GitHub. Accessed: Nov. 10, 2019. [Online]. Available: https://github.com/hashcat/hashcat/blob/master/rules/best64.rule
54. D. Manky, “Fortinet Predicts Highly Destructive and Self-learning ‘Swarm’ Cyberattacks in 2018”, Fortinet.com, Nov. 14, 2017. Accessed: Nov. 10, 2019. [Online]. Available: https://www.fortinet.com/corporate/about-us/newsroom/press-releases/2017/predicts-self-learning-swarm-cyberattacks-2018.html
55. S. Sivakorn, J. Polakis, A. Keromytis, “I’m not a human: Breaking the Google reCAPTCHA”, 2016. Accessed: Nov. 10, 2019. [Online]. Available: https://www.blackhat.com/docs/asia-16/materials/asia-16-Sivakorn-Im-Not-a-Human-Breaking-the-Google-reCAPTCHA-wp.pdf
56. L. Von Ahn, B. Maurer, C. McMillen, D. Abraham, and M. Blum, “reCAPTCHA: Human-based character recognition via web security measures”, Science, vol. 321, no. 5895, 2008.
57. A. Krizhevsky, I. Sutskever, G. Hinton, “ImageNet classification with deep convolutional neural networks”, Communications of the ACM, June 2017, vol. 60, iss. 6, pp. 84-90.
58. Clarifai. Accessed: Nov. 10, 2019. [Online]. Available: https://www.clarifai.com
59. M. Zeiler, G. Taylor, R. Fergus, “Adaptive deconvolutional networks for mid and high level feature learning”, in International Conference on Computer Vision, Barcelona, Spain, November 6-13, 2011.
60. Toronto Deep Learning Demos, Accessed: Nov. 10, 2019 [Online]. Available: http://deeplearning.cs.toronto.edu
61. N. Srivastava, R. Salakhutdinov, “Multimodal Learning with Deep Boltzmann Machines”, Journal of Machine Learning Research, vol. 15, 2014, pp. 2949-2980.
62. A. Karpathy, “Deep Visual-Semantic Alignments for Generating Image Descriptions”, 2015, Accessed: Nov. 10, 2019 [Online]. Available: https://cs.stanford.edu/people/karpathy/cvpr2015.pdf
63. Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama, T. Darrell, “Caffe: Convolutional Architecture for Fast Feature Embedding”, in ArXiv, 2014. Accessed: Nov. 10, 2019. [Online]. Available: https://arxiv.org/abs/1408.5093
64. “Shodan”, Accessed: Oct. 19, 2019 [Online]. Available: https://www.shodan.io/
65. “Angr”, Accessed: Oct. 19, 2019 [Online]. Available: https://angr.io/
66. “The Next Paradigm Shift AI-Driven Cyber-Attacks”, in Darktrace. Accessed: Oct. 19, 2019 [Online]. Available: https://www.darktrace.com/en/resources/wp-ai-driven-cyber-attacks.pdf
For citations:
Avdoshin S., Lazarenko A., Chichileva N., Naumov P., Klyucharev P. Machine Learning Use Cases in Cybersecurity. Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS). 2019;31(5):191-202. (In Russ.) https://doi.org/10.15514/ISPRAS-2019-31(5)-15