Preview

Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS)

Advanced search

Dynamic compilation of expressions in SQL queries for PostgreSQL

https://doi.org/10.15514/ISPRAS-2016-28(4)-13

Abstract

In recent years, as performance and capacity of main and external memory grow, performance of database management systems (DBMSes) on certain kinds of queries is more determined by raw CPU speed. Currently, PostgreSQL uses the interpreter to execute SQL queries. This yields an overhead caused by indirect calls to handler functions and runtime checks, which could be avoided if the query were compiled into native code "on-the-fly", i.e. just-in-time (JIT) compiled: at run time the specific table structure is known as well as data types and built-in functions used in the query as well as the query itself. This is especially important for complex queries, performance of which is CPU-bound. We’ve developed a PostgreSQL extension that implements SQL query JIT compilation using LLVM compiler infrastructure. In this paper we show how to implement JIT compilation to speed up sequential scan operator (SeqScan) as well as expressions in WHERE clauses. We describe some important optimizations that are possible only with dynamic compilation, such as precomputing tuple attributes offsets only for attributes used by the query. We also discuss the maintainability of our extension, i.e. the automation for translating PostgreSQL backend functions into LLVM IR, using the same source code both for our JIT compiler and the existing interpreter. Currently, with LLVM JIT we achieve up to 5x speedup on synthetic tests as compared to original PostgreSQL interpreter.

About the Authors

E. Y. Sharygin
Institute for System Programming of the Russian Academy of Sciences
Russian Federation


R. A. Buchatskiy
Institute for System Programming of the Russian Academy of Sciences
Russian Federation


L. V. Skvortsov
Lomonosov Moscow State University, CMC Department
Russian Federation


R. A. Zhuykov
Institute for System Programming of the Russian Academy of Sciences
Russian Federation


D. M. Melnik
Institute for System Programming of the Russian Academy of Sciences
Russian Federation


References

1. Graefe G. Volcano - an extensible and parallel query evaluation system. IEEE Trans. Knowl. Data Eng.,6(1): 120-135, 1994.

2. PostgreSQL, an open source object-relational database system. https://www.postgresql.org

3. Lattner C. LLVM: An Infrastructure for Multi-Stage Optimization. Master’s thesis, Computer Science Dept., University of Illinois at Urbana-Champaign, Urbana, IL.

4. Neumann T. Efficiently compiling efficient query plans for modern hardware. Proc. VLDB Endow. , vol. 4, no. 9, pp. 539-550, Jun. 2011.

5. Neumann T., Leis V. Compiling Database Queries into Machine Code. IEEE Data Engineering Bulletin, March 2014.

6. Zhang R., Debray S., Snodgrass R.T. Micro-specialization: dynamic code specialization of database management systems. In Code Generation and Optimization, pages 73, 2012.

7. Tan CK. Vitesse DB: 100% Postgres, 100X Faster For Analytics. https://docs.google.com/presentation/d/1R0po7_Wa9fym5U9Y5qHXGlUi77nSda2LlZXPuAxtd-M/pub

8. TPC-H, an ad-hoc, decision support benchmark. http://www.tpc.org/tpch/

9. Kornacker M., Behm A. et al. Impala: A Modern, Open-Source SQL Engine for Hadoop. CIDR, 2015.

10. Armbrust M., Xin R.S. et al. Spark SQL: Relational Data Processing in Spark. In Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data (SIGMOD '15). ACM, New York, NY, USA, 1383-1394, 2015.

11. Julia, a high-level, high-performance dynamic programming language for technical computing. http://julialang.org/

12. FTL JIT, JavaScriptCore's top-tier optimizing compiler. https://trac.webkit.org/wiki/FTLJIT

13. Vardanyan V., Ivanishin V., Asryan S., Khachatryan A., Hakobyan J. Dynamic Compilation of JavaScript Programs to the Statically Typed LLVM Intermediate Representation. Trudy ISP RАN / Proc. ISP RAS], vol. 27, issue 6, 2015. pp. 33-48 (in Russian). DOI: 10.15514/ISPRAS-2015-27(6)-3

14. bbPyston, an open-source Python implementation using JIT techniques. https://pyston.org

15. MacRuby, an implementation of Ruby 1.9 directly on top of Mac OS X core technologies such as the Objective-C runtime and garbage collector, the LLVM compiler infrastructure and the Foundation and ICU frameworks. http://www.macruby.org

16. MCJIT Design and Implementation. http://llvm.org/docs/MCJITDesignAndImplementation.html

17. Momjian B. PostgreSQL Internals through Pictures. http://momjian.us/main/writings/pgsql/internalpics.pdf


Review

For citations:


Sharygin E.Y., Buchatskiy R.A., Skvortsov L.V., Zhuykov R.A., Melnik D.M. Dynamic compilation of expressions in SQL queries for PostgreSQL. Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS). 2016;28(4):217-240. (In Russ.) https://doi.org/10.15514/ISPRAS-2016-28(4)-13



Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 2079-8156 (Print)
ISSN 2220-6426 (Online)