A debugger of parallel programs for OS Linux
https://doi.org/10.15514/ISPRAS-2020-32(4)-7
Abstract
The paper presents a debugger for parallel programs written in C/C++ or FORTRAN and executed on high-performance computers. It describes the debugger's program components and the mechanism of their interaction, discusses the capabilities of the graphical user interface, and explains how to profile a program with the built-in profiling tools. The paper also describes the new capabilities of the parallel debugger: a tree-like communication scheme for connecting its components, a non-interactive debugging mode, and support for Nvidia graphics accelerators. Currently, the debugger can launch debug jobs in batch processing systems such as Open PBS / Torque, SLURM, and CSP JAM, and it can be configured for other systems. The PD debugger makes it possible to debug program processes and threads, manage breakpoints and watchpoints, logically divide program processes into subsets and manage those subsets, view and change variables, and profile the debugged program using the free Google Performance Tools and mpiP. The PD debugger is written in the Java programming language, is intended for debugging programs on Unix / Linux operating systems, and uses free software components such as SwingX, JHDF5, Jzy3D, RSyntaxTextArea, and OpenGL.
About the Authors
Aleksey Borisovich KISELEV
Russian Federation
Ph.D in physics and mathematics, head of the laboratory for system software development
Sergey Nikolaevich KISELEV
Russian Federation
senior researcher
References
1. TotalView. URL: https://totalview.io/free-trial, 12.03.2020.
2. Distributed Debugging Tool. URL: https://www.arm.com/products/development-tools/server-and-hpc/forge/ddt, 12.03.2020.
3. Eclipse Parallel Tools Platform. URL: http://eclipse.org/ptp, 12.03.2020.
4. Fedorov V.K., Kiselev S.N. A debugger of parallel applications (PDB). Voprosy Atomnoy Nauki i Tekhniki. Series «Mathematical modelling of physical processes», issue 3, 2013, pp. 65-71 (in Russian).
5. Malyshkin V.E., Romanenko A.A. GEPARD – General Parallel Debugger for MVS-1000/M. Lecture Notes in Computer Science, vol. 2763, 2003, pp. 519-523.
6. Andrianov A.N., Bazarov S.B., Bugerya A.B., Koludarov P.I., Naboko I.M. Application of a parallel program debugger to multiprocessor modeling of shock and blast wave focusing. Keldysh Institute preprints, No. 50, 2004, 15 p. (in Russian).
7. Kiselev A.B., Kiselev S.N., Semenov G.P. A parallel debugger for clusters on the base of OS Linux. Voprosy Atomnoy Nauki i Tekhniki. Series «Mathematical modelling of physical processes», issue 2, 2018, pp. 72-80 (in Russian).
8. GDB. URL: http://www.gnu.org/software/gdb, 12.03.2020.
9. Torque. URL: http://www.adaptivecomputing.com/products/open-source/torque, 12.03.2020.
10. SLURM. URL: https://slurm.schedmd.com/documentation.html, 12.03.2020.
11. Kiselev A.B., Kiselev S.N. JAM batch jobs system. Voprosy Atomnoy Nauki i Tekhniki. Series «Mathematical modelling of physical processes», issue 4, 2009, pp. 60-66 (in Russian).
12. Google Performance Tools. URL: https://github.com/gperftools/gperftools, 12.03.2020.
13. mpiP: Lightweight, Scalable MPI Profiling. URL: http://mpip.sourceforge.net, 12.03.2020.
14. SwingX. URL: https://github.com/arotenberg/swingx, 12.03.2020.
15. JHDF5. URL: http://www.hdfgroup.org, 12.03.2020.
16. Jzy3D. URL: http://www.jzy3d.org, 12.03.2020.
17. RSyntaxTextArea. URL: http://www.sorceforge.org/rsyntaxtextarea, 12.03.2020.
18. OpenGL. URL: https://www.opengl.org/sdk/libs/, 12.03.2020.
19. CUDA Toolkit Documentation. URL: http://docs.nvidia.com/cuda/eula/index.html, 12.03.2020.
20. Kulnev D.V., Modjanov R.V., Petrik A.N. Secured OS. Open systems. DBMS, No. 4, 2015 (in Russian).
21. Petrik A.N. Secured OS Aramid for the super-computer. In Proc. of the National Supercomputer Forum (NSCF-2019), 2019 (in Russian).
For citations:
KISELEV A.B., KISELEV S.N. A debugger of parallel programs for OS Linux. Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS). 2020;32(4):97-114. (In Russ.) https://doi.org/10.15514/ISPRAS-2020-32(4)-7