Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS)

Analysis and development tools for efficient programs on parallel architectures

https://doi.org/10.15514/ISPRAS-2014-26(1)-14

Abstract

This article proposes methods that support the development of efficient programs for modern parallel architectures, including hybrid systems. It presents specialized profiling methods aimed at programmers who parallelize existing code, and discusses the problem of automatic parallel code generation for hybrid architectures. When achieving high efficiency on hybrid systems requires significant rework of data structures or algorithms, auto-tuning can be used to specialize the code for specific input data and hardware at run time. This approach is demonstrated by optimizing sparse matrix-vector multiplication for GPUs and using it to accelerate linear system solving in the OpenFOAM CFD package.
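
The run-time auto-tuning idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two storage formats (CSR and ELLPACK), the function names, and the timing strategy are all assumptions chosen for illustration, and plain Python on the CPU stands in for GPU kernels. The sketch converts a matrix to each candidate format, times one sparse matrix-vector product per format on the actual input, and keeps the fastest variant.

```python
import time


def to_csr(dense):
    """Convert a dense matrix (list of rows) to CSR arrays."""
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))
    return values, col_idx, row_ptr


def spmv_csr(csr, x):
    """y = A * x with A stored in CSR."""
    values, col_idx, row_ptr = csr
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y


def to_ell(dense):
    """Convert to ELLPACK: every row is padded to the same width."""
    rows = [[(j, v) for j, v in enumerate(row) if v != 0] for row in dense]
    width = max((len(r) for r in rows), default=0)
    # Padding entries use column 0 with value 0.0, so they add nothing.
    cols = [[j for j, _ in r] + [0] * (width - len(r)) for r in rows]
    vals = [[v for _, v in r] + [0.0] * (width - len(r)) for r in rows]
    return cols, vals, width


def spmv_ell(ell, x):
    """y = A * x with A stored in ELLPACK."""
    cols, vals, width = ell
    return [
        sum(vals[i][k] * x[cols[i][k]] for k in range(width))
        for i in range(len(cols))
    ]


def _time_once(kernel, data, x):
    start = time.perf_counter()
    kernel(data, x)
    return time.perf_counter() - start


def autotune_spmv(dense, x, repeats=3):
    """Pick the fastest (format, kernel) pair for this matrix at run time.

    Hypothetical helper: returns the chosen format name and a function
    that multiplies the already-converted matrix by a vector.
    """
    candidates = {"csr": (to_csr, spmv_csr), "ell": (to_ell, spmv_ell)}
    best = None
    for name, (convert, kernel) in candidates.items():
        data = convert(dense)
        elapsed = min(_time_once(kernel, data, x) for _ in range(repeats))
        if best is None or elapsed < best[0]:
            best = (elapsed, name, kernel, data)
    _, name, kernel, data = best
    return name, (lambda v: kernel(data, v))


A = [[4, 0, 0],
     [0, 0, 2],
     [1, 3, 0]]
x = [1, 2, 3]
chosen_format, multiply = autotune_spmv(A, x)
y = multiply(x)
```

Which format wins depends on the matrix structure and the machine, so the sketch deliberately makes no claim about the outcome. In an iterative linear solver the same matrix is multiplied many times, which is why a one-time tuning cost of this kind can pay off.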

About the Authors

Alexander Monakov
Institute for System Programming of RAS
Russian Federation


Eugene Velesevich
Institute for System Programming of RAS
Russian Federation


Vladimir Platonov
Institute for System Programming of RAS
Russian Federation


Arutyun Avetisyan
Institute for System Programming of RAS
Russian Federation



For citations:


Monakov A., Velesevich E., Platonov V., Avetisyan A. Analysis and development tools for efficient programs on parallel architectures. Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS). 2014;26(1):357-374. (In Russ.) https://doi.org/10.15514/ISPRAS-2014-26(1)-14



Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 2079-8156 (Print)
ISSN 2220-6426 (Online)