
Proceedings of the Institute for System Programming of the RAS


A Survey of Network Traffic Classification Methods Using Machine Learning

https://doi.org/10.15514/ISPRAS-2020-32(6)-11


Abstract

This article addresses the problem of network traffic classification using machine learning methods. It presents several formulations of the problem, describes the limitations of earlier methods, and explains why machine learning came into use in this area. It reviews machine learning algorithms that can be applied to the task and notes their advantages and disadvantages. It examines feature selection for classification and the problem of obtaining training data, along with the main trade-offs involved. Commonly used datasets and their characteristics are listed. The survey concludes with a description of open problems in the field: training and comparing models, protecting user data, and traffic variability.

About the authors

Alexander Igorevich GETMAN
Ivannikov Institute for System Programming of the RAS, National Research University Higher School of Economics
Russia
Candidate of Physical and Mathematical Sciences, Senior Researcher at ISP RAS, Associate Professor at HSE


Maria Kirillovna IKONNIKOVA
Ivannikov Institute for System Programming of the RAS
Russia
Postgraduate student






For citation:


GETMAN A.I., IKONNIKOVA M.K. A Survey of Network Traffic Classification. Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS). 2020;32(6):137-154. (In Russ.) https://doi.org/10.15514/ISPRAS-2020-32(6)-11



Content is available under the Creative Commons Attribution 4.0 License.


ISSN 2079-8156 (Print)
ISSN 2220-6426 (Online)