Preview

Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS)

Advanced search

Link-time optimization speedup

https://doi.org/10.15514/ISPRAS-2016-28(5)-11

Abstract

This paper proposes the two different approaches to speed-up program build: making link-time optimization work in parallel and lightweight optimization approach. The former technique is achieved by scaling LTO system. The latter makes link to work faster because of using summaries to manage some of interprocedural optimization passes instead of full IR code in memory. The problem of horizontal LTO system scaling leads to the problem of partition of the large task to several subtasks that can be performed concurrently. The problem is complicated by the compiler pipeline model: interprocedural optimization passes works consequentially and depends on previous performed ones. That means we can divide just data on which passes works, not passes themselves. We need to divide IR code to sub-independent parts and run LTO on them in parallel. We use program call graph analysis to divide a program to parts. Therefore, our goal is to divide call graph that is one of NP-compete problems. Nevertheless, the choice of the dividing algorithm strongly depends on properties of divided graph. The main goal of our investigation is to find lightweight graph partition algorithm that works efficiently on call graphs of real programs and that does not spoil LTO performance achievements after optimizing code pieces separately. This paper proposes new graph partition algorithm for program call graphs, results of comparing this one with two other methods on SPEC CPU2000 benchmark and implementation of the algorithm in scalable LLVM-based LTO system. The implementation of this approach in LTO system shows 31% link speedup and 3% of performance degradation for 4 threads. The lightweight optimization shows 0,5% speedup for single run in lazy code loading mode.

About the Authors

K. . Dolgorukova
Institute for System Programming of the Russian Academy of Sciences
Russian Federation


S. . Arishin
Institute for System Programming of the Russian Academy of Sciences
Russian Federation


References

1. Andrew Ayers, Richard Schooler, and Robert Gottlieb. Aggressive inlining. SIGPLAN Not., 32(5):134–145, 1997. ISSN 0362-1340. doi: http://doi.acm.org/10.1145/258916.258928.

2. Honza Hubicka. Linktime optimization in GCC, part 2 – Firefox.

3. http://hubicka.blogspot.ru/2014/04/linktime-optimization-in-gcc-2-firefox.html.

4. Honza Hubicka. Linktime optimization in GCC, part 3 – LibreOffice.

5. http://hubicka.blogspot.ru/2014/09/linktime-optimization-in-gcc-part-3.html.

6. K. Dolgorukova. [Overview of Scalable Frameworks of Cross-Module Optimization]. Trudy ISP RAN/Proc. ISP RAS, vol. 26, issue 3, 2014, pp. 69-90 (in Russian). DOI: 10.15514/ISPRAS-2014-26(3)-3.

7. K. Dolgorukova. [Implementation of Memory Scalability Approach for LLVM-Based Link-Time Optimization and Static Analyzing Systems]. Trudy ISP RAN/Proc. ISP RAS, vol. 27, issue 6, 2015, pp. 173-192 (in Russian). DOI: 10.15514/ISPRAS-2015-27(6)-7.

8. Satu Elisa Schaeffer. Graph clustering. Survey. Computer Science Review. Volume 1, Issue 1, August 2007, Pages 27–64. DOI:10.1016/j.cosrev.2007.05.001.

9. M. Girvan, M. E. J. Newman. Community structure in social biological networks. PNAS, June 11, 2002, vol.99, no.12. DOI: 10.1073/pnas.122653799.

10. M. E. J. Newman. Scientific collaboration networks. II. Shortest paths, weighted networks, and centrality. Physical review E, vol. 64, 016132. DOI: 10.1103/PhysRevE.64.016132.

11. L. da F. Costa, F. A. Rodrigues, G. Travieso & P. R. Villas Boas (2007). Characterization of complex networks: A survey of measurements, Advances in Physics, 56:1, 167-242. DOI:10.1080/00018730601170527

12. M. E. J. Newman, M. Girvan. Finding and evaluating community structure in networks. Phys. Rev. E 69, 026113 (2004). arXiv:cond-mat/0308217 DOI:10.1103/PhysRevE.69.026113.

13. L. Danon, A. Díaz Guilera, J. Duch, A. Arenas, Comparing community structure identification, Journal of Statistical Mechanics Theory and Experiment (2005) P09008.

14. D. A. Bader, H. Meyerhenke, P. Sanders, D. Wagner. Contemporary mathematics: graph partitioning and graph clustering. 10th DIMACS Implementation Challenge Workshop, February 13–14, 2012, Georgia Institute of Technology, Atlanta, GA.

15. Preston Briggs, Doug Evans, Brian Grant, Robert Hundt, William Maddox, Diego Novillo, Seongbae Park, David Sehr, Ian Taylor, Ollie Wild. WHOPR - Fast and Scalable Whole Program Optimizations in GCC. Initial Draft, 12 Dec 2007.

16. Gautam Chakrabarti, Fred Chow. Structure Layout Optimizations in the Open64 Compiler: Design, Implementation and Measurements. Open64 Workshop at CGO 2008, April 6, 2008. Boston, Massachusetts.

17. SPEC CPU benchmark. https://www.spec.org/cpu2000


Review

For citations:


Dolgorukova K., Arishin S. Link-time optimization speedup. Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS). 2016;28(5):175-198. (In Russ.) https://doi.org/10.15514/ISPRAS-2016-28(5)-11



Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 2079-8156 (Print)
ISSN 2220-6426 (Online)