Real Application of CNN Interpretation Methods: Document Image Classification Model Errors’ Detection and Validation
https://doi.org/10.15514/ISPRAS-2023-35(2)-1
Abstract
In this paper, we consider a case of applying convolutional neural network interpretation methods to a ResNet-18 model in order to identify and justify model errors. The model is used for classifying the orientation of text document images. First, using interpretation methods, we formed a hypothesis about why the neural network shows low metrics on data that differs from the training images: the suspected cause was artifacts in the generated training images, introduced by the image rotation function. Then, using the Vanilla Gradient, Guided Backpropagation, Integrated Gradients, and Grad-CAM methods together with a purpose-built metric, we were able to confirm this hypothesis. The obtained results helped to significantly improve the accuracy of the model.
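To illustrate the suspected failure mode, the following is a minimal sketch, not the authors' pipeline: it assumes PyTorch, torchvision, and the captum library (one of several possible implementations of Integrated Gradients), and the band-fraction score at the end is a hypothetical stand-in for the metric proposed in the paper.

import torch
import torchvision
import torchvision.transforms.functional as TF
from captum.attr import IntegratedGradients

# torchvision's rotate keeps the original frame size, so rotating a
# non-square document page leaves zero-filled regions -- the kind of
# artifact the paper suspects the model learned to rely on.
model = torchvision.models.resnet18(num_classes=4)  # 4 classes: 0/90/180/270 degrees
model.eval()

page = torch.rand(1, 3, 300, 220)     # stand-in for a portrait document image
rotated = TF.rotate(page, angle=90)   # same 300x220 frame: the rotated content
                                      # leaves zero-filled bands at top and bottom

ig = IntegratedGradients(model)
attr = ig.attribute(rotated, target=1, n_steps=32).abs()

# Fraction of total attribution falling inside the zero-filled bands
# (rows 0..39 and 260..299); a high value suggests the model keys on the
# rotation artifact rather than on the document text.
band = attr[..., :40, :].sum() + attr[..., -40:, :].sum()
print(f"attribution fraction in filled bands: {(band / attr.sum()).item():.3f}")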
About the Authors
Alexander Olegovich GOLODKOV
Russian Federation
Graduate of the Moscow Institute of Physics and Technology, senior laboratory assistant
Oksana Vladimirovna BELYAEVA
Russian Federation
PhD student, Researcher
Andrey Igorevich PERMINOV
Russian Federation
PhD student, Researcher
References
1. Wang J., Yang Y. et al. CNN-RNN: A Unified Framework for Multi-label Image Classification. In Proc. of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 2285-2294.
2. Milletari F., Navab N., Ahmadi S.A. V-Net: Fully Convolutional Neural Networks for Volumetric Medical Image Segmentation. In Proc. of the Fourth International Conference on 3D Vision (3DV), 2016, pp. 565-571.
3. Xie X., Cheng G. et al. Oriented R-CNN for Object Detection. In Proc. of the IEEE/CVF International Conference on Computer Vision (ICCV), 2021, pp. 3500-3509.
4. He F., Liu T., Tao D. Why ResNet Works? Residuals Generalize. IEEE Transactions on Neural Networks and Learning Systems, vol. 31, issue 12, 2020, pp. 5349-5362.
5. Buhrmester V., Münch D., Arens M. Analysis of Explainers of Black Box Deep Neural Networks for Computer Vision: A Survey. Machine Learning and Knowledge Extraction, vol. 3, issue 4, 2021, pp. 966-989.
6. Li G., Yu Y. Visual Saliency Detection Based on Multiscale Deep CNN Features. IEEE Transactions on Image Processing, vol. 25, issue 11, 2016, pp. 5012-5024.
7. Barredo-Arrieta A., Díaz-Rodríguez N. et al. Explainable Artificial Intelligence (XAI): Concepts, Taxonomies, Opportunities and Challenges toward Responsible AI. Information Fusion, vol. 58, 2020, pp. 82-115.
8. Simonyan K., Vedaldi A., Zisserman A. Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps. arXiv preprint arXiv:1312.6034, 2013, 8 p.
9. Springenberg J.T., Dosovitskiy A. et al. Striving for Simplicity: The All Convolutional Net. arXiv preprint arXiv:1412.6806, 2014, 14 p.
10. Sundararajan M., Taly A., Yan Q. Axiomatic Attribution for Deep Networks. In Proc. of the 34th International Conference on Machine Learning (ICML), 2017, pp. 3319-3328.
11. Selvaraju R.R., Cogswell M. et al. Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization. In Proc. of the IEEE International Conference on Computer Vision (ICCV), 2017, pp. 618-626.
12. Kapishnikov A., Bolukbasi T. et al. XRAI: Better Attributions Through Regions. In Proc. of the IEEE/CVF International Conference on Computer Vision (ICCV), 2019, pp. 4948-4957.
13. Olah C., Mordvintsev A., Schubert L. Feature Visualization, 2017. Available at: https://distill.pub/2017/feature-visualization/, accessed May 18, 2023.
14. Desai S., Ramaswamy H.G. Ablation-CAM: Visual Explanations for Deep Convolutional Network via Gradient-free Localization. In Proc. of the IEEE Winter Conference on Applications of Computer Vision (WACV), 2020, pp. 972-980.
For citations:
GOLODKOV A.O., BELYAEVA O.V., PERMINOV A.I. Real Application of CNN Interpretation Methods: Document Image Classification Model Errors’ Detection and Validation. Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS). 2023;35(2):7-18. https://doi.org/10.15514/ISPRAS-2023-35(2)-1