Active learning and transfer learning for document segmentation
https://doi.org/10.15514/ISPRAS-2021-33(6)-14
Abstract
In this paper, we investigate how effectively classical active learning approaches reduce the size of the training set needed for document image segmentation. We present a modified approach to selecting images for labeling and subsequent training. The results obtained with active learning are compared to transfer learning on fully labeled data. We also investigate how the domain of the training set used to initialize the model for transfer learning affects the subsequent fine-tuning of the model.
About the Authors
Dmitry Maratovich KIRANOV
Russian Federation
MIPT master’s student, laboratory assistant at ISP RAS
Maxim Alexeevitch RYNDIN
Russian Federation
PhD Student
Ilya Sergeevich KOZLOV
Russian Federation
Researcher
For citations:
KIRANOV D.M., RYNDIN M.A., KOZLOV I.S. Active learning and transfer learning for document segmentation. Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS). 2021;33(6):205-216. (In Russ.) https://doi.org/10.15514/ISPRAS-2021-33(6)-14