Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS)

Optimizations in Dynamic Binary Translation

Abstract

We propose using the OpenCL standard to program FPGA devices that serve as accelerators in a heterogeneous system. We describe an implementation of the subset of OpenCL required to organize data exchange and task management for an FPGA connected to the CPU via the PCI Express bus. The required functions fall into three parts. The first is basic device manipulation and FPGA program loading; the latter requires flashing the FPGA via the JTAG interface. The second is the transfer of memory buffers to and from the FPGA. Implementing it in the runtime library is straightforward provided that the FPGA supports PCI Express exchanges; most of the work falls on the FPGA driver and the FPGA system-level firmware that organize these exchanges. The third is FPGA task management, which is handled by a simple task scheduler implemented within the FPGA driver. The code running on the FPGA can be written in a hardware description language or generated automatically by one of the known translators, e.g. C-to-Verilog, but it must adhere to the ABI defined by the FPGA driver and firmware implementations.
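The three parts named in the abstract (program loading, buffer transfer, and task scheduling) can be illustrated with a toy host-side model. This is only a sketch under stated assumptions: the class `MockFpga`, its methods, and the kernel `increment_all` are hypothetical names invented for illustration. They are not the OpenCL API and not the authors' runtime, driver, or firmware, and the first-in-first-out queue is just one possible reading of "simple task scheduler".

```python
from collections import deque


class MockFpga:
    """Toy stand-in for an FPGA accelerator; every name here is hypothetical."""

    def __init__(self):
        self.kernels = {}      # loaded "program": kernel name -> Python callable
        self.memory = {}       # device buffers: handle -> bytearray
        self.queue = deque()   # pending tasks, served first-in-first-out
        self.next_handle = 0

    # Part 1: device manipulation and program loading.
    def load_program(self, kernels):
        # Stands in for flashing the device; here it only stores callables.
        self.kernels = dict(kernels)

    # Part 2: memory buffer transfer to and from the device.
    def create_buffer(self, size):
        handle = self.next_handle
        self.next_handle += 1
        self.memory[handle] = bytearray(size)
        return handle

    def write_buffer(self, handle, data):
        if len(data) > len(self.memory[handle]):
            raise ValueError("data does not fit in the device buffer")
        self.memory[handle][:len(data)] = data

    def read_buffer(self, handle):
        return bytes(self.memory[handle])

    # Part 3: task management via a simple FIFO scheduler (an assumption).
    def enqueue_task(self, name, *handles):
        if name not in self.kernels:
            raise KeyError("kernel not loaded: " + name)
        self.queue.append((name, handles))

    def finish(self):
        # Run queued tasks in submission order until the queue is empty.
        while self.queue:
            name, handles = self.queue.popleft()
            self.kernels[name](*(self.memory[h] for h in handles))


def increment_all(buf):
    """Hypothetical kernel: add 1 to every byte, wrapping at 256."""
    for i in range(len(buf)):
        buf[i] = (buf[i] + 1) % 256


def run_demo():
    dev = MockFpga()
    dev.load_program({"increment_all": increment_all})
    buf = dev.create_buffer(4)
    dev.write_buffer(buf, bytes([1, 2, 3, 255]))
    dev.enqueue_task("increment_all", buf)
    dev.enqueue_task("increment_all", buf)
    dev.finish()
    return dev.read_buffer(buf)
```

In this model the host never touches device memory directly: data moves only through `write_buffer` and `read_buffer`, and work runs only when the scheduler drains its queue, which mirrors the division of labor between runtime library and driver that the abstract describes.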

About the Authors

Andrey Belevantsev
ISP RAS
Russian Federation


Alexey Merkulov
ISP RAS
Russian Federation


Vladimir Platonov
ISP RAS
Russian Federation


References

1. Khronos OpenCL Working Group. The OpenCL 1.1 Specification, September 2010. http://www.khronos.org/registry/cl/specs/opencl-1.1.pdf

2. NVIDIA OpenCL JumpStart Guide, April 2009. http://developer.download.nvidia.com/OpenCL/NVIDIA_OpenCL_JumpStart_Guide.pdf

3. Xilinx Virtex-6 Family Overview. Version 2.3, March 2011. http://www.xilinx.com/support/documentation/data_sheets/ds150.pdf

4. A. Belevantsev, A. Kravets, A. Monakov. Avtomaticheskaya generatsiya OpenCL-koda iz gnezd tsiklov s pomoshh'yu poliehdral'noj modeli [Automatically generating OpenCL code from loop nests via a polyhedral model]. Trudy ISP RAN [The Proceedings of ISP RAS], volume 21, pp. 5-22, 2011. (In Russian)

5. Nadav Rotem and Yosi Ben Asher. C to Verilog. Automating circuit design. http://c-to-verilog.com/.

6. C. Lavin, M. Padilla, S. Ghosh, B. Nelson, B. Hutchings, and M. Wirthlin. Using Hard Macros to Reduce FPGA Compilation Time. International Conference on Field Programmable Logic and Applications, IEEE, 2010, pp. 438–44.


For citations:


Belevantsev A., Merkulov A., Platonov V. Optimizations in Dynamic Binary Translation. Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS). 2012;22. (In Russ.)



Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 2079-8156 (Print)
ISSN 2220-6426 (Online)