Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS)

Topology-aware cloud scheduling for HPC

Abstract

For some compute-intensive applications, cloud computing can be a cost-effective alternative or an addition to supercomputers. However, in high-performance computing, overall application performance depends heavily on how processes are mapped to the network nodes. A cloud scheduler must therefore be topology-aware to reduce network congestion. In this paper we evaluate the Hop-Byte metric for a "fat tree" network topology under the assumption that all pairs of processes communicate evenly. We propose a scheduling algorithm that tries to minimize this metric and implement it on top of the OpenStack scheduler. All instances are divided into groups according to the compute-intensive application they belong to. Every time the scheduler receives a request to launch N new instances of the same group, it maps them to nodes in such a way that the entire group (including already running instances) uses as few lower-level switches in the "fat tree" as possible. We measured the impact of topology-aware scheduling on the performance of the NASA Advanced Supercomputing Parallel Benchmarks. Results for 10 of the 11 benchmarks changed insignificantly. For the Block Tridiagonal test, the average time decreased by 14%, the maximum time decreased by 40%, and the difference between the maximum and minimum times decreased from 42% to 3%; that is, the fluctuation almost disappeared.
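The placement idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' OpenStack implementation. It assumes a two-level fat tree in which two instances under the same lower-level switch are 2 hops apart and instances under different lower-level switches are 4 hops apart, and it assumes every pair of instances exchanges the same amount of data. The function names (`hops`, `hop_bytes`, `place_group`) and the dictionary inputs are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def hops(switch_a, switch_b):
    """Hop count between two instances (assumed two-level fat tree):
    2 under the same lower-level switch, 4 otherwise."""
    return 2 if switch_a == switch_b else 4


def hop_bytes(placement, bytes_per_pair=1):
    """Hop-Byte total for a list of lower-level switch ids, one per instance.

    Assumes all pairs of instances communicate evenly, so every pair
    contributes hops * bytes_per_pair.
    """
    return sum(hops(a, b) * bytes_per_pair
               for a, b in combinations(placement, 2))


def place_group(free_slots, group_usage, n_new):
    """Pick lower-level switches for n_new instances of one group.

    free_slots:  {switch_id: free instance slots under that switch}
    group_usage: {switch_id: instances of this group already running there}
    Returns {switch_id: number of new instances to start there}.
    """
    if n_new > sum(free_slots.values()):
        raise ValueError("not enough free slots for the request")
    # Switches the group already uses sort first (False < True); within
    # each class, switches with more free slots come first, so the group
    # spreads over as few additional switches as possible.
    order = sorted(
        free_slots,
        key=lambda s: (group_usage.get(s, 0) == 0, -free_slots[s]),
    )
    assignment = {}
    remaining = n_new
    for switch in order:
        if remaining == 0:
            break
        take = min(free_slots[switch], remaining)
        if take > 0:
            assignment[switch] = take
            remaining -= take
    return assignment


# Example: the group already runs 2 instances under s3; 4 more are requested.
plan = place_group({"s1": 2, "s2": 5, "s3": 1}, {"s3": 2}, 4)
```

In this example the single free slot under the already used switch `s3` is filled first, and the remaining three instances go to `s2`, the switch with the most free slots, so the group ends up on two lower-level switches. Under the stated hop assumptions, `hop_bytes` is lower for placements that concentrate instances under fewer switches.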

About the Authors

I. A. Dudina
ISP RAS, Moscow
Russian Federation


A. O. Kudryavtsev
ISP RAS, Moscow
Russian Federation


S. S. Gaissaryan
ISP RAS, Moscow
Russian Federation



For citations:


Dudina I.A., Kudryavtsev A.O., Gaissaryan S.S. Topology-aware cloud scheduling for HPC. Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS). 2013;24. (In Russ.)



Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 2079-8156 (Print)
ISSN 2220-6426 (Online)