PUMA
Istituto di Scienza e Tecnologie dell'Informazione     
Carlini E., Dazzi P., Esposito A., Lulli A., Ricci L. Balanced graph partitioning with Apache spark. Luís Lopes, Julius Žilinskas, Alexandru Costan (et al...) (eds.). (Lecture Notes in Computer Science, vol. 8805). Porto, Portugal: Springer, 2014.
 
 
Abstract
(English)
A significant part of the data produced every day by online services is structured as a graph. Therefore, there is the need for efficient processing and analysis solutions for large scale graphs. Among the others, the balanced graph partitioning is a well known NP-complete problem with a wide range of applications. Several solutions have been proposed so far, however most of the existing state-of-the-art algorithms are not directly applicable in very large-scale distributed scenarios. A recently proposed promising alternative exploits a vertex-center heuristics to solve the balance graph partitioning problem. Their algorithm is massively parallel: there is no central coordination, and each node is processed independently. Unfortunately, we found such algorithm to be not directly exploitable in current BSP-like distributed programming frameworks. In this paper we present the adaptations we applied to the original algorithm while implementing it on Spark, a state-of-the-art distributed framework for data processing.
URL: http://link.springer.com/chapter/10.1007%2F978-3-319-14325-5_12
DOI: 10.1007/978-3-319-14325-5_12
Subject Graph Processing
Distributed Graph Partitioning
Distrbuted Algorithms
C.2.4 Distributed Systems


Icona documento 1) Download Document PDF


Icona documento Open access Icona documento Restricted Icona documento Private

 


Per ulteriori informazioni, contattare: Librarian http://puma.isti.cnr.it

Valid HTML 4.0 Transitional