Real Application of CNN Interpretation Methods: Document Image Classification Model Errors’ Detection and Validation
https://doi.org/10.15514/ISPRAS-2023-35(2)-1
Abstract
In this paper, we consider a case of applying convolutional neural network interpretation methods to a ResNet-18 model in order to identify and justify model errors. The model is used for classifying the orientation of text document images. First, using interpretation methods, we formed a hypothesis about why the neural network shows low metrics on data that differs from the training images: the suspected cause was artifacts in the generated training images, introduced by the image rotation function. Then, using the Vanilla Gradient, Guided Backpropagation, Integrated Gradients, and Grad-CAM methods together with a purpose-built metric, we were able to confirm this hypothesis. The obtained results helped to significantly improve the accuracy of the model.
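To illustrate the suspected failure mode, the following is a minimal sketch, not the authors' pipeline: it assumes PyTorch, torchvision, and the captum library (one of several possible implementations of Integrated Gradients), and the band-fraction score at the end is a hypothetical stand-in for the metric proposed in the paper.

import torch
import torchvision
import torchvision.transforms.functional as TF
from captum.attr import IntegratedGradients

# torchvision's rotate keeps the original frame size, so rotating a
# non-square document page leaves zero-filled regions -- the kind of
# artifact the paper suspects the model learned to rely on.
model = torchvision.models.resnet18(num_classes=4)  # 4 classes: 0/90/180/270 degrees
model.eval()

page = torch.rand(1, 3, 300, 220)     # stand-in for a portrait document image
rotated = TF.rotate(page, angle=90)   # same 300x220 frame: the rotated content
                                      # leaves zero-filled bands at top and bottom

ig = IntegratedGradients(model)
attr = ig.attribute(rotated, target=1, n_steps=32).abs()

# Fraction of total attribution falling inside the zero-filled bands
# (rows 0..39 and 260..299); a high value suggests the model keys on the
# rotation artifact rather than on the document text.
band = attr[..., :40, :].sum() + attr[..., -40:, :].sum()
print(f"attribution fraction in filled bands: {(band / attr.sum()).item():.3f}")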
About the Authors
Alexander Olegovich GOLODKOV
Russian Federation
Graduate of the Moscow Institute of Physics and Technology, senior laboratory assistant
Oksana Vladimirovna BELYAEVA
Russian Federation
PhD student, Researcher
Andrey Igorevich PERMINOV
Russian Federation
PhD student, Researcher
References
1. Wang J., Yang Y. et al. CNN-RNN: A Unified Framework for Multi-label Image Classification. In Proc. of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 2285-2294.
2. Milletari F., Navab N., Ahmadi S.A. V-Net: Fully Convolutional Neural Networks for Volumetric Medical Image Segmentation. In Proc. of the Fourth International Conference on 3D Vision (3DV), 2016, pp. 565-571.
3. Xie X., Cheng G. et al. Oriented R-CNN for Object Detection. In Proc. of the IEEE/CVF International Conference on Computer Vision (ICCV), 2021, pp. 3500-3509.
4. He F., Liu T., Tao D. Why ResNet Works? Residuals Generalize. IEEE Transactions on Neural Networks and Learning Systems, vol. 31, issue 12, 2020, pp. 5349-5362.
5. Buhrmester V., Münch D., Arens M. Analysis of Explainers of Black Box Deep Neural Networks for Computer Vision: A Survey. Machine Learning and Knowledge Extraction, vol. 3, issue 4, 2021, pp. 966-989.
6. Li G., Yu Y. Visual Saliency Detection Based on Multiscale Deep CNN Features. IEEE Transactions on Image Processing, vol. 25, issue 11, 2016, pp. 5012-5024.
7. Barredo-Arrieta A., Díaz-Rodríguez N. et al. Explainable Artificial Intelligence (XAI): Concepts, Taxonomies, Opportunities and Challenges toward Responsible AI. Information Fusion, vol. 58, 2020, pp. 82-115.
8. Simonyan K., Vedaldi A., Zisserman A. Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps. arXiv preprint arXiv:1312.6034, 2013, 8 p.
9. Springenberg J.T., Dosovitskiy A. et al. Striving for Simplicity: The All Convolutional Net. arXiv preprint arXiv:1412.6806, 2014, 14 p.
10. Sundararajan M., Taly A., Yan Q. Axiomatic Attribution for Deep Networks. In Proc. of the 34th International Conference on Machine Learning (ICML), 2017, pp. 3319-3328.
11. Selvaraju R.R., Cogswell M. et al. Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization. In Proc. of the IEEE International Conference on Computer Vision (ICCV), 2017, pp. 618-626.
12. Kapishnikov A., Bolukbasi T. et al. XRAI: Better Attributions Through Regions. In Proc. of the IEEE/CVF International Conference on Computer Vision (ICCV), 2019, pp. 4948-4957.
13. Olah C., Mordvintsev A., Schubert L. Feature Visualization, 2017. Available at: https://distill.pub/2017/feature-visualization/, accessed May 18, 2023.
14. Desai S., Ramaswamy H.G. Ablation-CAM: Visual Explanations for Deep Convolutional Network via Gradient-free Localization. In Proc. of the IEEE Winter Conference on Applications of Computer Vision (WACV), 2020, pp. 972-980.
For citations:
GOLODKOV A.O., BELYAEVA O.V., PERMINOV A.I. Real Application of CNN Interpretation Methods: Document Image Classification Model Errors’ Detection and Validation. Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS). 2023;35(2):7-18. https://doi.org/10.15514/ISPRAS-2023-35(2)-1