Devirtualization-Based Python Static Analysis
https://doi.org/10.15514/ISPRAS-2025-37(6)-39
Abstract
In this paper we present an approach to static analysis of Python programs based on a low-level intermediate representation and devirtualization, which together enable interprocedural and intermodule analysis. The approach can analyze Python programs without type annotations and find complex defects that are inaccessible to traditional AST-based analysis tools. Using CPython bytecode as a base, we construct a representation suitable for static analysis and resolve calls with an interprocedural devirtualization algorithm. We implemented the proposed approach in a static analyzer for finding errors in C, C++, Java, and Go programs and achieved good results on open-source projects with minimal modifications to existing detectors. The detectors that are relevant to Python had true positive rates from 60% to 96%. This demonstrates that our approach makes it possible to apply techniques used for the analysis of statically typed languages to Python.
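To make the call-resolution problem concrete, the following minimal sketch uses the standard `dis` module to show that an unannotated method call names only the attribute in CPython bytecode, not the callee. The classes `Animal`, `Dog`, `Cat` and the helpers `attribute_loads` and `resolve` are hypothetical names introduced for illustration. The resolution step is a toy lookup over a hand-supplied set of receiver classes, not the interprocedural devirtualization algorithm of the paper.

```python
import dis


class Animal:
    def speak(self):
        return "..."


class Dog(Animal):
    def speak(self):
        return "woof"


class Cat(Animal):
    pass


def greet(pet):
    # No annotations: nothing in this function says which class `pet` has.
    return pet.speak()


def attribute_loads(func):
    """Return the names fetched by attribute-load instructions in func."""
    # The opcode is LOAD_METHOD on older CPython versions and LOAD_ATTR on
    # newer ones, so both are accepted here.
    return [
        ins.argval
        for ins in dis.get_instructions(func)
        if ins.opname in ("LOAD_ATTR", "LOAD_METHOD")
    ]


def resolve(receiver_classes, name):
    """Toy resolution: map each receiver class to the class defining `name`."""
    targets = {}
    for cls in receiver_classes:
        for base in cls.__mro__:  # same lookup order the interpreter uses
            if name in vars(base):
                targets[cls.__name__] = base.__name__
                break
    return targets


loads = attribute_loads(greet)
# Assumption: some earlier analysis step determined the possible receivers.
targets = resolve([Dog, Cat], "speak")
print(loads, targets)
```

The bytecode alone yields only the name `speak`; the call target becomes known once the set of possible receiver classes is established, which is the role devirtualization plays in the approach described above.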
About the Authors
Artemiy Lvovich GALUSTOV
Russian Federation
Master’s graduate, researcher at ISP RAS. Research interests: static analysis for finding errors in source code.
Konstantin Igorevich VIHLYANTSEV
Russian Federation
Master’s student at the Moscow Institute of Physics and Technology (Faculty of Radio Engineering and Cybernetics). His research interests include static analysis and profiling of dynamic programming languages.
Alexey Evgenevich BORODIN
Russian Federation
Cand. Sci. (Phys.-Math.), researcher. Research interests: static analysis for finding errors in source code.
Andrey Andreevich BELEVANTSEV
Russian Federation
Dr. Sci. (Phys.-Math.), Prof., Corresponding Member of the RAS, leading researcher at ISP RAS, Professor at Moscow State University. Research interests: static analysis, program optimization, parallel programming.
For citations:
GALUSTOV A.L., VIHLYANTSEV K.I., BORODIN A.E., BELEVANTSEV A.A. Devirtualization-Based Python Static Analysis. Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS). 2025;37(6):109-120. https://doi.org/10.15514/ISPRAS-2025-37(6)-39






