SQLite RDBMS Extension for Data Indexing Using B-tree Modifications
https://doi.org/10.15514/ISPRAS-2019-31(3)-16
Abstract
Multiway trees are one of the most popular solutions for the big data indexing. The most commonly used kind of the multiway trees is the B-tree. There exist different modifications of the B-trees, including B+-trees, B*-trees and B*+-trees considered in this work. However, these modifications are not supported by the popular open-source relational DBMS SQLite. This work is based on the previous research on the performance of multiway trees in the problem of structured data indexing, with the previously developed multiway trees C++ library usage. In this research the B*+-tree was developed as the data structure which combines the main B+-tree and B*-tree features together. Also, in the research the empirical computational complexities of different operations on the B-tree and its modifications were measured as well as the memory usage. The purpose of the current work is the development of the SQLite RDBMS extension which allows to use B-tree modifications (B+-tree, B*-tree and B*+-tree) as index structures in the SQLite RDBMS. The modifications of the base data structure were developed as a C++ library. The library is connected to the SQLite using the C-C++ cross-language API which is developed in the current work. The SQLite extension implements the novel algorithm for selecting the index structure (one of B-tree’s modifications) for some table of a database. The provided SQLite extension is adopted by the SQLite EventLog component of the LDOPA process mining library. In addition, the experiment on the counting the empirical computational complexities of operations on the trees of different types is conducted using the developed in this work SQLite extension.
About the Authors
Anton Mikhailovitch RiginRussian Federation
Sergey Andreevich Shershakov
Russian Federation
Researcher at PAIS Lab of the Faculty of Computer Science
References
1. . Manyika J., Chui M., Brown B., Bughin J., Dobbs R., Roxburgh C., Hung Byers A. Big data: The next frontier for innovation, competition, and productivity. McKinsey Global Institute, May 2011. Available at: https://www.mckinsey.com/~/media/McKinsey/Business%20Functions/McKinsey%20Digital/Our%20Insights/Big%20data%20The%20next%20frontier%20for%20innovation/MGI_big_data_exec_summary.ashx, accessed Jan. 20, 2019.
2. . Bayer R., McCreight E. Organization and maintenance of large ordered indexes. Acta Informatica, vol. 1, no. 3, 1972, pp. 173 – 189.
3. . Pollari-Malmi K. B+-trees. Available at: https://www.cs.helsinki.fi/u/mluukkai/tirak2010/B-tree.pdf, accessed Dec. 24, 2018.
4. . B*-tree. NIST Dictionary of Algorithms and Data Structures. Available at: https://xlinux.nist.gov/dads/HTML/bstartree.html, accessed Dec. 24, 2018.
5. . Rigin A.M. On the Performance of Multiway Trees in the Problem of Structured Data Indexing. Coursework, Dept. Soft. Eng., HSE, Moscow, Russia, 2018 (in Russian) / Ригин В.М. Исследование эффективности сильно ветвящихся деревьев в задаче индексирования структурированных данных. Курсовая работа, Департамент программной инженерии, ФКН, ВШЭ, Москва, 2018.
6. . SQLite Home Page. Available at: https://www.sqlite.org/, accessed Jan. 20, 2019.
7. . Library for Dynamic Operational Process Analysis (LDOPA). xiart.ru Projects. Available at: https://prj.xiart.ru/projects/ldopa, accessed Jul. 1, 2019.
8. . SQLiteStudio. Available at: https://sqlitestudio.pl/, accessed Jan. 26, 2019.
Review
For citations:
Rigin A.M., Shershakov S.A. SQLite RDBMS Extension for Data Indexing Using B-tree Modifications. Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS). 2019;31(3):203-216. https://doi.org/10.15514/ISPRAS-2019-31(3)-16