Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS)

Loops software pipelining on ARM platform

Abstract

This article describes improvements we made to the implementation of swing modulo scheduling (SMS), a well-known software pipelining technique, in the GNU Compiler Collection (GCC) for the ARM platform. The prior GCC implementation required a pipelined loop to conform to the do-loop pattern, which relies on a special hardware instruction. ARM has no such instruction. We first implemented a “fake” do-loop instruction in the ARM backend, which let us check whether the GCC SMS implementation is profitable on ARM. We then designed and implemented support for loops whose counter varies as an arithmetic progression. In do-loops the loop counter may be used only in the control part of the loop, whereas we allow other loop instructions to read the loop counter register. For such loops we improved the algorithm that creates the prologue and epilogue, and implemented a much more complex algorithm for verifying the conditions under which the performance-optimized version of the loop is entered. We also made the changes to the data dependency graph that are necessary to generate correct code. When the dependency graph is built, we create additional anti-dependencies between instructions that use the flag register. The resulting performance improvement is 3-4% for selected test applications on the ARM platform. On the x86-64 platform, performance results are mostly neutral, with the exception of a 2-3% improvement on matrix multiplication tests.
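The shape of the transformation summarized above can be illustrated with a small model. The sketch below is not the GCC implementation and does not reproduce the article's condition-verification algorithm. It is a simplified Python illustration under stated assumptions: a two-stage pipeline, a positive constant step, and a loop body that reads the loop counter. All names (`trip_count`, `plain_loop`, `pipelined_loop`, `STAGES`) are hypothetical.

```python
STAGES = 2  # assumed pipeline depth for this sketch


def trip_count(start, bound, step):
    """Iterations of `for (i = start; i < bound; i += step)` with step > 0."""
    if step <= 0:
        raise ValueError("this sketch only handles a positive step")
    if start >= bound:
        return 0
    # Ceiling division: the counter is an arithmetic progression.
    return (bound - start + step - 1) // step


def plain_loop(a, start, bound, step):
    """Reference loop: load a[i], multiply by the counter i, store back."""
    out = list(a)
    i = start
    while i < bound:
        x = out[i]       # stage 0: load
        out[i] = x * i   # stage 1: compute (reads the counter) and store
        i += step
    return out


def pipelined_loop(a, start, bound, step):
    """Same computation with stage 0 of iteration k+1 overlapping
    stage 1 of iteration k."""
    n = trip_count(start, bound, step)
    # Entry condition (simplified): the pipelined version is only valid
    # when there are at least STAGES iterations; otherwise fall back.
    if n < STAGES:
        return plain_loop(a, start, bound, step)
    out = list(a)
    i = start
    loaded = out[i]              # prologue: stage 0 of the first iteration
    for _ in range(n - 1):       # kernel: runs n - (STAGES - 1) times
        nxt = out[i + step]      # stage 0 of the next iteration
        out[i] = loaded * i      # stage 1 of the current iteration
        loaded = nxt
        i += step
    out[i] = loaded * i          # epilogue: stage 1 of the last iteration
    return out
```

In this model the guard on `trip_count` plays the role of the check that decides whether the optimized version of the loop may be entered, and the prologue and epilogue fill and drain the pipeline. Because the body reads `i`, the kernel has to keep the counter value consistent with the iteration whose stage is executing, which is one reason such loops need more care than do-loops.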

About the Authors

Roman Zhuykov
ISP RAS
Russian Federation


Dmitry Melnik
ISP RAS
Russian Federation


Ruben Buchatskiy
ISP RAS
Russian Federation


For citations:


Zhuykov R., Melnik D., Buchatskiy R. Loops software pipelining on ARM platform. Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS). 2012;22. (In Russ.)



This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 2079-8156 (Print)
ISSN 2220-6426 (Online)