Analysis and development tools for efficient programs on parallel architectures

Alexander Monakov; Eugene Velesevich; Vladimir Platonov; Arutyun Avetisyan

doi:10.15514/ISPRAS-2014-26(1)-14

Abstract

The article proposes methods for supporting the development of efficient programs for modern parallel architectures, including hybrid systems. It presents specialized profiling methods aimed at programmers who must parallelize existing code, and discusses the problem of automatic parallel code generation for hybrid architectures. Where achieving high efficiency on hybrid systems requires significant rework of data structures or algorithms, auto-tuning can be used to specialize the program for specific input data and hardware at run time. This approach is demonstrated by optimizing sparse matrix-vector multiplication for GPUs and applying it to accelerate linear system solving in the OpenFOAM CFD package.
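To make the auto-tuning idea concrete, here is a minimal CPU-only sketch of choosing a sparse matrix-vector multiplication variant at run time by timing candidates on the actual input. It is an illustration under stated assumptions, not the authors' implementation: the two storage layouts (CSR and a padded ELLPACK-style layout) are standard textbook formats chosen for the example, the function names `spmv_csr`, `csr_to_ell`, `spmv_ell`, and `autotune` are hypothetical, and a real GPU tuner would time device kernels rather than Python loops.

```python
import time

def spmv_csr(row_ptr, col_idx, vals, x):
    """Compute y = A @ x for A stored in CSR (compressed sparse row) form."""
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += vals[k] * x[col_idx[k]]
        y[i] = acc
    return y

def csr_to_ell(row_ptr, col_idx, vals):
    """Convert CSR to a padded layout: every row is as wide as the longest row."""
    n = len(row_ptr) - 1
    width = max((row_ptr[i + 1] - row_ptr[i] for i in range(n)), default=0)
    ell_cols = [[0] * width for _ in range(n)]      # padding points at column 0
    ell_vals = [[0.0] * width for _ in range(n)]    # padding value is 0.0
    for i in range(n):
        for j, k in enumerate(range(row_ptr[i], row_ptr[i + 1])):
            ell_cols[i][j] = col_idx[k]
            ell_vals[i][j] = vals[k]
    return ell_cols, ell_vals

def spmv_ell(ell_cols, ell_vals, x):
    """Compute y = A @ x for the padded layout; padded entries contribute zero."""
    return [sum(v * x[c] for c, v in zip(cols, row_vals))
            for cols, row_vals in zip(ell_cols, ell_vals)]

def autotune(candidates, x, repeats=3):
    """Time each candidate kernel on the real input and return the fastest name."""
    best_name, best_time = None, float("inf")
    for name, kernel in candidates.items():
        start = time.perf_counter()
        for _ in range(repeats):
            kernel(x)
        elapsed = time.perf_counter() - start
        if elapsed < best_time:
            best_name, best_time = name, elapsed
    return best_name

# Example matrix (an assumption for illustration):
# [[4, 0, 1],
#  [0, 2, 0],
#  [3, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
vals = [4.0, 1.0, 2.0, 3.0, 5.0]
x = [1.0, 2.0, 3.0]

ell_cols, ell_vals = csr_to_ell(row_ptr, col_idx, vals)
candidates = {
    "csr": lambda v: spmv_csr(row_ptr, col_idx, vals, v),
    "ell": lambda v: spmv_ell(ell_cols, ell_vals, v),
}
chosen = autotune(candidates, x)
y = candidates[chosen](x)
```

Which variant wins depends on the matrix and the machine, which is why the choice is deferred to run time: padded layouts tend to suit matrices with uniform row lengths, while CSR avoids wasted work when row lengths vary widely.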

Keywords

software optimization, profiling, sparse matrices, OpenFOAM

About the Authors

Alexander Monakov

Institute for System Programming of RAS
Russian Federation

Eugene Velesevich

Institute for System Programming of RAS
Russian Federation

Vladimir Platonov

Institute for System Programming of RAS
Russian Federation

Arutyun Avetisyan

Institute for System Programming of RAS
Russian Federation

For citations:

Monakov A., Velesevich E., Platonov V., Avetisyan A. Analysis and development tools for efficient programs on parallel architectures. Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS). 2014;26(1):357-374. (In Russ.) https://doi.org/10.15514/ISPRAS-2014-26(1)-14

This work is licensed under a Creative Commons Attribution 4.0 License.

ISSN 2079-8156 (Print)
ISSN 2220-6426 (Online)
