
Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS)


Approach to Building AI-Compilers Using the MLIR Framework

https://doi.org/10.15514/ISPRAS-2025-37(1)-5

Abstract

The development of matrix extensions for processor architectures, and their implementation in specialized AI processors, can significantly improve the efficiency of artificial neural networks. This paper reviews the basic functionality of several popular matrix extensions, in particular ARM SME, RISC-V IME, and RISC-V AME, as well as the DaVinci processor architecture. Based on this analysis, a model of an abstract matrix processor is proposed that reflects the features of modern processor architectures supporting matrix extensions. For this model, a heterogeneous matrix intermediate representation was developed that can be used to build compilers for neural networks. The proposed intermediate representation is implemented in the MLIR infrastructure as the heteroMx dialect. The paper also describes an approach to building an AI compiler using the heteroMx dialect. The developed intermediate representation can be adapted or specialized for other matrix processor architectures.

About the Authors

Ivan Ivanovich KULAGIN
Ivannikov Institute for System Programming of the Russian Academy of Sciences, Lomonosov Moscow State University
Russian Federation

Cand. Sci. (Tech.), Researcher at ISP RAS. Research interests: compiler construction, compiler optimizations, polyhedral compilation, code generation, parallel programming models, AI accelerators.



Ruben Arturovich BUCHATSKIY
Ivannikov Institute for System Programming of the Russian Academy of Sciences
Russian Federation

Cand. Sci. (Tech.), Researcher at the Compiler Technology department of ISP RAS. Research interests: static analysis, compiler technologies, optimizations.



Mikhail Vyacheslavovich PANTILIMONOV
Ivannikov Institute for System Programming of the Russian Academy of Sciences
Russian Federation

Researcher at the Compiler Technology department of ISP RAS. Research interests: static analysis, compiler technologies, DBMS.



Andrey Viktorovich VYAZOVTSEV
Ivannikov Institute for System Programming of the Russian Academy of Sciences, Moscow Institute of Physics and Technology
Russian Federation

Student at MIPT, laboratory assistant in the Compiler Technology department of ISP RAS. Research interests: static analysis, compiler technologies, optimizations.



Mikhail Maksimovich ROMANOV
Ivannikov Institute for System Programming of the Russian Academy of Sciences, Lomonosov Moscow State University
Russian Federation

Student at the CMC department of MSU, laboratory assistant in the Compiler Technology department of ISP RAS. Research interests: compiler technologies, acceleration of artificial neural networks.



Dmitry Mikhailovich MELNIK
Ivannikov Institute for System Programming of the Russian Academy of Sciences, Lomonosov Moscow State University
Russian Federation

Senior Researcher in the Compiler Technology department. Research interests: compiler optimizations, dynamic (JIT) compilation.





For citations:


KULAGIN I.I., BUCHATSKIY R.A., PANTILIMONOV M.V., VYAZOVTSEV A.V., ROMANOV M.M., MELNIK D.M. Approach to Building AI-Compilers Using the MLIR Framework. Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS). 2025;37(1):87-106. (In Russ.) https://doi.org/10.15514/ISPRAS-2025-37(1)-5



Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 2079-8156 (Print)
ISSN 2220-6426 (Online)