Machine Learning-Based Malicious Users' Detection in the VKontakte Social Network

Denis Igorevich SAMOKHVALOV

doi:10.15514/ISPRAS-2020-32(3)-10


Abstract

This paper presents a machine-learning approach to detecting malicious accounts in VKontakte, the largest Russian social network. Exploratory data analysis was carried out to identify anomalies and patterns in a dataset of 42,394 malicious and 241,035 genuine VKontakte user accounts. To obtain this dataset, a tool for automatically collecting information about malicious accounts in VKontakte was developed; the paper describes its architecture. A classification model was trained with the CatBoost library on features generated from user data. The results show that the model identifies malicious users with an overall AUC of 0.91, confirmed by four-fold cross-validation.
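The evaluation protocol named in the abstract, AUC averaged over four cross-validation folds, can be illustrated with a minimal standard-library sketch. Only the two class sizes come from the abstract. The single synthetic feature, the centroid-based scorer (a hypothetical stand-in for the CatBoost model), the stratified splitting, and all function names are assumptions made for illustration and are not taken from the paper.

```python
import random
from statistics import mean

# Class sizes reported in the abstract; everything else below is synthetic.
N_MALICIOUS, N_GENUINE = 42394, 241035
POS_RATE = N_MALICIOUS / (N_MALICIOUS + N_GENUINE)  # roughly 0.15


def roc_auc(labels, scores):
    """AUC via the rank-sum (Mann-Whitney) formula; tied scores share an average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def stratified_folds(labels, k, seed=0):
    """Split indices into k folds, keeping the class ratio similar in each fold."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds


def fit_centroids(xs, ys):
    """Hypothetical stand-in for model training: per-class mean of one feature."""
    mu_neg = mean(x for x, y in zip(xs, ys) if y == 0)
    mu_pos = mean(x for x, y in zip(xs, ys) if y == 1)
    return mu_neg, mu_pos


def score(model, x):
    """Higher score means 'more likely malicious' under the centroid stand-in."""
    mu_neg, mu_pos = model
    return (x - (mu_neg + mu_pos) / 2) * (mu_pos - mu_neg)


def cross_validated_auc(xs, ys, k=4, seed=0):
    """Train on k-1 folds, compute AUC on the held-out fold, repeat for every fold."""
    aucs = []
    for held_out in stratified_folds(ys, k, seed):
        held = set(held_out)
        train = [i for i in range(len(ys)) if i not in held]
        model = fit_centroids([xs[i] for i in train], [ys[i] for i in train])
        aucs.append(roc_auc([ys[i] for i in held_out],
                            [score(model, xs[i]) for i in held_out]))
    return aucs


def make_synthetic(n=2000, seed=1):
    """Synthetic one-feature data with the abstract's class ratio (illustrative only)."""
    rng = random.Random(seed)
    ys = [1 if rng.random() < POS_RATE else 0 for _ in range(n)]
    xs = [rng.gauss(1.5 if y else 0.0, 1.0) for y in ys]
    return xs, ys


if __name__ == "__main__":
    xs, ys = make_synthetic()
    fold_aucs = cross_validated_auc(xs, ys)
    print("per-fold AUC:", [round(a, 3) for a in fold_aucs])
    print("mean AUC:", round(mean(fold_aucs), 3))
```

In a real pipeline the `fit_centroids` and `score` pair would be replaced by the trained classifier's fit and probability outputs. Because malicious accounts make up only about 15% of the dataset, AUC is a more informative summary here than plain accuracy.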

Keywords

VKontakte, malicious users, machine learning, social networks, classification models, data analysis

About the author

Denis Igorevich SAMOKHVALOV

National Research University Higher School of Economics
Russia

Master's student



For citation:

SAMOKHVALOV D.I. Machine Learning-Based malicious users’ detection in the VKontakte social network. Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS). 2020;32(3):109-117. https://doi.org/10.15514/ISPRAS-2020-32(3)-10

Content is available under the Creative Commons Attribution 4.0 License.

ISSN 2079-8156 (Print)
ISSN 2220-6426 (Online)
