Preview

Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS)

Advanced search

Improving quality of graph partitioning using multilevel optimization

https://doi.org/10.15514/ISPRAS-2014-26(4)-2

Abstract

Graph partitioning is required for solving tasks on graphs that need to be split across disks or computers. This problem is well studied, but most results are not suitable for processing graphs with billons of nodes on commodity clusters, since they require shared memory or low-latency messaging. One approach suitable for cluster computing is Balanced Label Propagation, based on distributed label propagation algorithm for community detection. In this work we show how multi-level optimization can be used to improve partitioning quality of Balanced Label Propagation. One of major difficulties with distributed multi-level optimization is finding a matching in the graph. The matching is needed to choose pairs of vertices for collapsing in order to produce a smaller graph. As this work shows, simply splitting graph into several parts and finding matching in these parts independently is enough to improve the quality of partitioning generated by Balanced Label Propagation. Proposed algorithm can be implemented within any framework that supports MapReduce. In our experiments, when graphs were partitioned into 32 parts, ratio of edges that donтАЩt cross partitions increased from 54-60% to 66-70%. One of significant problems of our implementation is performance тАУ work time of multi-level algorithm was approximately twice that of the original algorithm. It seems likely that implementation can be improved so that multi-level algorithm would achieve better computational performance as well as partitioning quality.

About the Authors

R. Pastukhov
ISP RAS
Russian Federation


A. Korshunov
ISP RAS
Russian Federation


D. Turdakov
ISP RAS
Russian Federation


S. Kuznetsov
ISP RAS
Russian Federation


References

1. M. R. Garey, D. S. Johnson, and L. Stockmeyer. “Some simplified NP-complete graph problems”. In: Theoretical computer science 1.3 (1976), pp. 237–267.

2. G. Karypis and V. Kumar. “A fast and high quality multilevel scheme for partitioning irregular graphs”. In: SIAM Journal on scientific Computing 20.1 (1998), pp. 359–392.

3. J. Ugander and L. Backstrom. “Balanced Label Propagation for Partitioning Massive Graphs”. In: Proceedings of the Sixth ACM International Conference on Web Search and Data Mining. WSDM ’13. Rome, Italy: ACM, 2013, pp. 507–516. isbn: 978-1-4503-1869-3. doi: 10.1145/2433396.2433461.

4. B. W. Kernighan and S. Lin. “An Efficient Heuristic Procedure for Partitioning Graphs”. In: Bell System Technical Journal 49.2 (1970), pp. 291–307. doi: 10.1002/j.1538-7305.1970.tb01770.x.

5. U. Raghavan, R. Albert, and S. Kumara. “Near linear time algorithm to detect community structures in large-scale networks”. In: Physical Review E 76.3 (2007). doi: 10.1103/physreve.76.036106.

6. J. Dean and S. Ghemawat. “MapReduce: Simplified Data Processing on Large Clusters”. In: OSDI 2004. 2004, pp. 137–150.

7. M. Zaharia et al. “Resilient Distributed Datasets: A Fault-tolerant Abstraction for In-memory Cluster Computing”. In: Proceedings of the 9th USENIX Conference on Networked Systems Design and Implementation. NSDI’12. San Jose, CA: USENIX Association, 2012.

8. Stanford Network Analysis Project, LiveJournal social network. 2006. url: http://snap.stanford.edu/data/soc-LiveJournal1.html.

9. L. Backstrom et al. “Group Formation in Large Social Networks: Membership, Growth, and Evolution”. In: Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. KDD ’06. Philadelphia, PA, USA: ACM, 2006, pp. 44–54. isbn: 1-59593-339-5. doi: 10.1145/1150402.1150412.

10. Sampling Online Social Networks, Facebook Social Graph. 2009. url: http://odysseas.calit2.uci.edu/doku.php/public:online_social_networks#facebook_social_graph_-_mhrw_uni.

11. M. Gjoka et al. “Walking in Facebook: A Case Study of Unbiased Sampling of OSNs”. In: Proceedings of the 29th Conference on Information Communications. INFOCOM’10. San Diego, California, USA: IEEE Press, 2010, pp. 2498–2506. isbn: 978-1-4244-5836-3. doi: 10.1109/infcom.2010.5462078.

12. J. E. Gonzalez et al. “PowerGraph: Distributed Graph-parallel Computation on Natural Graphs”. In: Proceedings of the 10th USENIX Conference on Operating Systems Design and Implementation. OSDI’12. Hollywood, CA, USA: USENIX Association, 2012, pp. 17–30. isbn: 978-1-931971-96-6.


Review

For citations:


Pastukhov R., Korshunov A., Turdakov D., Kuznetsov S. Improving quality of graph partitioning using multilevel optimization. Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS). 2014;26(4):21-32. (In Russ.) https://doi.org/10.15514/ISPRAS-2014-26(4)-2



Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 2079-8156 (Print)
ISSN 2220-6426 (Online)