Approach to Building AI-Compilers Using the MLIR Framework
https://doi.org/10.15514/ISPRAS-2025-37(1)-5
Abstract
The development of matrix extensions for processor architectures, and their implementation in specialized AI processors, can significantly improve the efficiency of artificial neural networks. The paper gives an overview of the basic functionality of several popular matrix extensions, in particular ARM SME, RISC-V IME, and RISC-V AME, as well as the DaVinci processor architecture. Based on this analysis, a model of an abstract matrix processor is proposed that reflects the features of modern processor architectures supporting matrix extensions. For this model, a heterogeneous matrix intermediate representation was developed that can be used to build compilers for neural networks. The proposed intermediate representation is implemented in the MLIR infrastructure as the heteroMx dialect. The paper also describes an approach to building an AI compiler using the heteroMx dialect. The developed intermediate representation can be adapted or specialized for other matrix processor architectures.
About the Authors
Ivan Ivanovich KULAGIN
Russian Federation
Cand. Sci. (Tech.), Researcher at ISP RAS. Research interests: compiler construction, compiler optimizations, polyhedral compilation, code generation, parallel programming models, AI accelerators.
Ruben Arturovich BUCHATSKIY
Russian Federation
Cand. Sci. (Tech.), Researcher at the Compiler Technology Department of ISP RAS. Research interests: static analysis, compiler technologies, optimizations.
Mikhail Vyacheslavovich PANTILIMONOV
Russian Federation
Researcher at the Compiler Technology Department of ISP RAS. Research interests: static analysis, compiler technologies, DBMS.
Andrey Viktorovich VYAZOVTSEV
Russian Federation
Student at MIPT, laboratory assistant at the Compiler Technology Department of ISP RAS. Research interests: static analysis, compiler technologies, optimizations.
Mikhail Maksimovich ROMANOV
Russian Federation
Student at the CMC department of MSU, laboratory assistant at the Compiler Technology Department of ISP RAS. Research interests: compiler technologies, acceleration of artificial neural networks.
Dmitry Mikhailovich MELNIK
Russian Federation
Senior Researcher at the Compiler Technology Department. Research interests: compiler optimizations, dynamic (JIT) compilation.
For citations:
KULAGIN I.I., BUCHATSKIY R.A., PANTILIMONOV M.V., VYAZOVTSEV A.V., ROMANOV M.M., MELNIK D.M. Approach to Building AI-Compilers Using the MLIR Framework. Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS). 2025;37(1):87-106. (In Russ.) https://doi.org/10.15514/ISPRAS-2025-37(1)-5