Generation of images with handwritten text in Russian
https://doi.org/10.15514/ISPRAS-2023-35(2)-2
Abstract
Automatic handwriting recognition is an important component of electronic document analysis, but the problem is still far from solved. One of the main obstacles to handwriting recognition is the insufficient amount of data available for training recognition models. For Russian the problem is more acute and is exacerbated by the wide variety of complex handwriting. This paper explores how various methods of generating additional training data affect the quality of recognition models: a method based on handwritten fonts, the StackMix method of assembling words from characters, and a generative adversarial network. A font-based method for creating images of handwritten text in Russian is developed and described. In addition, an algorithm is proposed for constructing a new Cyrillic handwritten font from existing images of handwritten characters. The effectiveness of the developed method was tested in experiments on two publicly available Cyrillic datasets using two different recognition models. The experiments showed that the developed image generation method increased handwriting recognition accuracy by an average of 6%, which is comparable to the results of other, more complex methods. The source code of the experiments and of the proposed method, as well as the datasets generated during the experiments, are publicly available for download.
About the Authors
Anastasiya Olegovna BOGATENKOVA
Russian Federation
Master’s student of the Department of System Programming
Oksana Vladimirovna BELYAEVA
Russian Federation
PhD student, Researcher
Andrey Igorevich PERMINOV
Russian Federation
PhD student, Researcher
For citations:
BOGATENKOVA A.O., BELYAEVA O.V., PERMINOV A.I. Generation of images with handwritten text in Russian. Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS). 2023;35(2):19-34. (In Russ.) https://doi.org/10.15514/ISPRAS-2023-35(2)-2