Istituto di Scienza e Tecnologie dell'Informazione     
Lulli A., Ricci L., Bertolucci M. Current flow betweenness centrality with Apache Spark. In: ICA3PP 2016 - Algorithms and Architectures for Parallel Processing. 16th International Conference (Granada, Spain, 14-16 December 2016). Proceedings, pp. 270 - 278. Jesus Carretero, Javier Garcia-Blas, Ryan K.L. Ko, Peter Mueller, Koji Nakano (eds.). (Lecture Notes in Computer Science, vol. 10048). Springer, 2016.
Identifying the most central nodes of a graph is a fundamental task in data analysis. Current flow betweenness is a centrality index that considers how information flows along all the paths of a graph, not only the shortest ones. Computing its exact value is expensive for large graphs, so algorithms that return an approximation of this measure are needed. In this paper we propose a solution that estimates the current flow betweenness in a distributed setting using the Apache Spark framework. The computation is organized according to the Gather Apply Scatter model. Our experimental evaluation shows that the algorithm achieves high correlation with the exact value of the current flow betweenness and outperforms other algorithms.
URL: http://link.springer.com/chapter/10.1007/978-3-319-49583-5_21
DOI: 10.1007/978-3-319-49583-5_21
Subject: Centrality Measure
Thinking like a vertex
Apache Spark
H.2.8 DATABASE MANAGEMENT. Database Applications. Data Mining