Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS)

Implementing Apache Spark jobs execution and Apache Spark cluster creation for Openstack Sahara[1]

https://doi.org/10.15514/ISPRAS-2015-27(5)-3

Abstract

This paper discusses the problem of creating virtual clusters in clouds for big data analysis with Apache Hadoop and Apache Spark. It describes existing methods for creating Apache Spark clusters, as well as an implemented solution for building Apache Spark clusters and executing Apache Spark jobs in an OpenStack environment. The solution is a modification of the OpenStack Sahara project and was included in the OpenStack Liberty release.

About the Authors

A. Aleksiyants
ISP RAS
Russian Federation


O. Borisenko
ISP RAS
Russian Federation


D. Turdakov
ISP RAS; CMC MSU; FCS NRU HSE
Russian Federation


A. Sher
ISP RAS
Russian Federation


S. Kuznetsov
ISP RAS; CMC MSU; Moscow Institute of Physics and Technology
Russian Federation


References

1. Dean J., Ghemawat S. MapReduce: Simplified Data Processing on Large Clusters. OSDI'04: Sixth Symposium on Operating Systems Design and Implementation, San Francisco, CA, December 2004.

2. Official Hadoop homepage - http://hadoop.apache.org/

3. Official Infinispan homepage - http://infinispan.org/

4. Official Cloudera CDH Apache Hadoop homepage - http://www.cloudera.com/content/cloudera/en/products-and-services/cdh.html

5. Official Basho Riak homepage - http://basho.com/riak/

6. Official Apache Spark homepage - http://spark.apache.org/

7. M. Chowdhury, M. Zaharia, I. Stoica. Performance and Scalability of Broadcast in Spark. 2010.

8. VMWare Serengeti page - http://www.vmware.com/hadoop/serengeti

9. Official Cloudera Manager homepage - http://www.cloudera.com/content/cloudera/en/products-and-services/cloudera-enterprise/cloudera-manager.html

10. Buyya R., Broberg J., Goscinski A. Cloud Computing: Principles and Paradigms. Wiley, 2011, 664 p.

11. Buyya R., Yeo C. S., Venugopal S. Market-oriented cloud computing: Vision, hype, and reality for delivering IT services as computing utilities. CoRR, abs/0808.3558, 2008.

12. Swift Architectural Overview - http://docs.openstack.org/developer/swift/overview_architecture.html

13. Web Services Description Language (WSDL) Version 2.0 Part 1: Core Language - http://www.w3.org/TR/wsdl20/

14. Nurmi D. The Eucalyptus Open-Source Cloud-Computing System. Cluster Computing and the Grid, 2009. doi: 10.1109/CCGRID.2009.93

15. Nilson J. Hadoop MapReduce in Eucalyptus Private Cloud. Bachelor's Thesis in Computing Science. Umea, Sweden, 2011

16. Official Openstack Heat homepage - https://wiki.openstack.org/wiki/Heat

17. O. Borisenko, D. Turdakov, S. Kuznetsov. Avtomaticheskoe sozdanie virtual'nȳkh klasterov Apache Spark v oblachnoĭ srede Openstack [Automating cluster creation and management for Apache Spark in Openstack cloud]. Trudy ISP RAN [The Proceedings of ISP RAS], volume 26, issue 4, pp. 33-43, 2014. (In Russian)

18. Official Amazon Elastic Compute Cloud (EC2) homepage - http://aws.amazon.com/ec2/

19. Creeger M. Cloud Computing: An Overview. ACM Queue, 7(5), 2009.

20. OpenStack Sahara roadmap - https://wiki.openstack.org/wiki/Sahara/Roadmap

21. OpenStack Sahara Architecture - http://docs.openstack.org/developer/sahara/architecture.html


For citations:


Aleksiyants A., Borisenko O., Turdakov D., Sher A., Kuznetsov S. Implementing Apache Spark jobs execution and Apache Spark cluster creation for Openstack Sahara[1]. Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS). 2015;27(5):35-48. (In Russ.) https://doi.org/10.15514/ISPRAS-2015-27(5)-3



Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 2079-8156 (Print)
ISSN 2220-6426 (Online)