Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS)

Advanced supervised learning in multi-layer perceptrons to the recognition tasks based on correlation indicator

https://doi.org/10.15514/ISPRAS-2021-33(1)-2

Abstract

The article addresses the recognition of handwritten digits with feedforward neural networks (perceptrons) using a correlation indicator. The proposed method rests on a mathematical model of the neural network as an oscillatory system, analogous to an information transmission system. The authors draw on their earlier theoretical work on finding the global extremum of the error function in artificial neural networks. A handwritten digit image is treated as a one-dimensional discrete input signal composed of an "ideal digit writing" plus noise, the noise describing the deviation of the input realization from the ideal writing. The loss function is built on the ideal observer criterion (Kotelnikov criterion), which is widely used in information transmission systems and describes the probability of correct recognition of the input signal. The article compares the convergence of training on experimentally obtained sequences for the correlation indicator and for the CrossEntropyLoss function widely used in classification tasks, both with and without an optimizer. Based on the experiments carried out, the authors conclude that the proposed correlation indicator gives a 2-3-fold advantage in convergence.
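The correlation indicator described above can be illustrated with a minimal stand-alone sketch. This is plain Python rather than the PyTorch setup the authors actually used, and the template values, the `normalized_correlation` helper, and the `1 - correlation` loss form are illustrative assumptions, not the paper's exact formulas:

```python
import math
import random

def normalized_correlation(x, t):
    """Pearson-style correlation between a flattened input signal and a template."""
    mx, mt = sum(x) / len(x), sum(t) / len(t)
    xc = [v - mx for v in x]
    tc = [v - mt for v in t]
    num = sum(a * b for a, b in zip(xc, tc))
    den = math.sqrt(sum(a * a for a in xc) * sum(b * b for b in tc))
    return num / (den + 1e-12)

def correlation_loss(x, templates, true_class):
    # Hypothetical correlation indicator as a loss: the better the input
    # correlates with the "ideal writing" of its true class, the smaller it is.
    return 1.0 - normalized_correlation(x, templates[true_class])

# Toy demo: two "ideal writings" and a noisy realization of class 0,
# mirroring the signal = ideal writing + noise decomposition.
random.seed(0)
templates = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]]
signal = [v + 0.1 * random.gauss(0.0, 1.0) for v in templates[0]]

# Ideal-observer decision: pick the class whose template correlates best.
scores = [normalized_correlation(signal, t) for t in templates]
pred = max(range(len(templates)), key=lambda i: scores[i])
```

In this toy setting the noisy signal correlates far better with its own class template than with the other one, so the ideal-observer rule recovers class 0 and the loss for the true class is the smallest.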

About the Authors

Nikolay Anatolievich VERSHKOV
North-Caucasus Federal University
Russian Federation
Ph.D. in Engineering Sciences


Mikhail Grigoryevich BABENKO
North-Caucasus Federal University
Russian Federation
Ph.D. in Physics and Mathematics


Viktor Andreevich KUCHUKOV
North-Caucasus Federal University
Russian Federation
Research Assistant


Natalia Nikolaevna KUCHUKOVA
North-Caucasus Federal University
Russian Federation
Leading Specialist


References

1. Kolmogorov A.N. On the representation of continuous functions of many variables by superposition of continuous functions of one variable and addition. American Mathematical Society Translations: Series 2, vol. 28, 1963, pp. 55-59.

2. Arnol'd V.I. On the representation of functions of several variables as a superposition of functions of a smaller number of variables. In Vladimir I. Arnold - Collected Works, vol. 1. Springer, 2009.

3. Hecht-Nielsen R. Neurocomputing. Addison-Wesley, 1989, 433 p.

4. Dzyadyk V.K. Introduction to the Theory of the Uniform Approximation of Functions by Polynomials. Nauka, 1977, 512 p. (in Russian).

5. Hebb D.O. The Organization of Behavior. Wiley, 1949, 335 p.

6. Hinton G.E. Training Products of Experts by Minimizing Contrastive Divergence. Neural Computation, vol. 14, no. 8, 2002, pp. 1771-1800.

7. Hinton G.E. Learning Multiple Layers of Representation. Trends in Cognitive Sciences, vol. 11, 2007, pp. 428-434.

8. Nuzhny A.S. Bayes regularization in the selection of weight coefficients in the predictor ensembles. Trudy ISP RAN/Proc. ISP RAS, vol. 31, issue 4, 2019, pp. 113-120. DOI: 10.15514/ISPRAS-2019-31(4)-7 (in Russian).

9. García-Hernández L.E., Barrios-Hernande C.J., Radchenko G. et al. Multi-objective Configuration of a Secured Distributed Cloud Data Storage. Communications in Computer and Information Science, vol. 1087, 2019, pp. 78-93.

10. Nikolenko S., Kadurin A., Arhangel'skaya E. Deep Learning. Piter, 2018, 480 p. (in Russian).

11. Dorogov A.Y. Implementation of spectral transformations in the class of fast neural networks. Programming and Computer Software, vol. 29, no. 4, 2003, pp. 187-198.

12. Adjemov S.S. et al. The use of artificial neural networks for classification of signal sources in cognitive radio systems. Programming and Computer Software, vol. 42, no. 3, 2016, pp. 121-128.

13. Vershkov N.A., Kuchukov V.A., Kuchukova N.N., Babenko M. The Wave Model of Artificial Neural Network. In Proc. of the 2020 IEEE Conference of Russian Young Researchers in Electrical and Electronic Engineering (EIConRus 2020), pp. 542-547.

14. Shannon C. Works on Information Theory and Cybernetics. Izdatel'stvo inostrannoj literatury, 1963, 830 p. (in Russian).

15. Sikarev A.A., Lebedev O.N. Microelectronic Devices for the Generation and Processing of Complex Signals. Radio i svyaz', 1983, 213 p. (in Russian).

16. Widrow B. Adaptive sampled-data systems, a statistical theory of adaptation. IRE WESCON Convention Record, vol. 4, 1959, pp. 74-85.

17. Ifeachor E.C., Jervis B.W. Digital Signal Processing: A Practical Approach, 2nd ed. Pearson Education, 2002, 933 p.

18. Solodov A.V. Information Theory and Its Application to Tasks of Automatic Control and Monitoring. Nauka, 1967 (in Russian).

19. Erofeeva V.A. An overview of data mining concepts based on neural networks. Stohasticheskaya optimizaciya v informatike, vol. 11, no. 3, 2015, pp. 3-17 (in Russian).

20. Tsypkin Ya.Z. Information Theory of Identification. Nauka, Fizmatlit, 1995, 336 p. (in Russian).

21. Vershkov N.A., Kuchukov V.A., Kuchukova N.N. The theoretical approach to the search for a global extremum in the training of neural networks. Trudy ISP RAN/Proc. ISP RAS, vol. 31, issue 2, 2019, pp. 41-52. DOI: 10.15514/ISPRAS-2019-31(2)-4 (in Russian).

22. Haykin S. Neural Networks: A Comprehensive Foundation, 2nd ed. Prentice Hall, 1999, 842 p.

23. Linnik Yu.V. The Method of Least Squares and the Foundations of the Mathematical-Statistical Theory of Observation Processing. Fizmatgiz, 1958, 334 p. (in Russian).

24. Rao D., McMahan B. Natural Language Processing with PyTorch: Build Intelligent Language Applications Using Deep Learning. O'Reilly Media, 2019, 256 p.

25. LeCun Y., Cortes C., Burges C.J.C. The MNIST database of handwritten digits. Available at: http://yann.lecun.com/exdb/mnist/, accessed 10.02.2020.

26. PyTorch. Available at: https://pytorch.org/, accessed 10.11.2019.


For citations:


VERSHKOV N.A., BABENKO M.G., KUCHUKOV V.A., KUCHUKOVA N.N. Advanced supervised learning in multi-layer perceptrons to the recognition tasks based on correlation indicator. Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS). 2021;33(1):33-46. (In Russ.) https://doi.org/10.15514/ISPRAS-2021-33(1)-2



This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 2079-8156 (Print)
ISSN 2220-6426 (Online)