Sunday, October 22 • 2:00pm - 2:30pm
MCL Clustering of Sparse Graphs

The increasing need for clustering in several scientific domains has inevitably driven the creation of innovative algorithms, each designed to perform more efficiently in certain applications. More specifically, in many applications, the data entities involved can be portrayed effectively by a graph as a collection of nodes and edges. One of the most established algorithms for graph clustering problems is the Markov Cluster Algorithm (MCL).
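
For readers unfamiliar with MCL, the following is a minimal single-machine sketch in Python, not the distributed implementation discussed in this talk. MCL alternates expansion (squaring a column-stochastic matrix) with inflation (raising entries to a power and renormalizing columns) until the matrix stabilizes. The function names, the dict-of-dicts matrix layout, the added self-loops, and the default inflation, iteration count and pruning threshold are all assumptions made for illustration.

```python
from collections import defaultdict


def normalize_columns(m):
    """Scale every column so that its stored entries sum to 1."""
    col_sums = defaultdict(float)
    for row in m.values():
        for j, v in row.items():
            col_sums[j] += v
    return {i: {j: v / col_sums[j] for j, v in row.items()}
            for i, row in m.items()}


def multiply(a, b):
    """Sparse product a @ b that only visits stored (nonzero) entries."""
    out = {}
    for i, row in a.items():
        acc = defaultdict(float)
        for k, v in row.items():
            for j, w in b.get(k, {}).items():
                acc[j] += v * w
        if acc:
            out[i] = dict(acc)
    return out


def inflate(m, power, threshold):
    """Raise entries to `power`, drop tiny values, renormalize columns."""
    powered = {}
    for i, row in m.items():
        kept = {j: v ** power for j, v in row.items() if v ** power > threshold}
        if kept:
            powered[i] = kept
    return normalize_columns(powered)


def mcl(edges, inflation=2.0, iterations=30, threshold=1e-6):
    """Cluster an undirected graph given as a list of (u, v) edges."""
    m = defaultdict(dict)
    for u, v in edges:
        m[u][v] = m[v][u] = 1.0
    for u in list(m):
        m[u][u] = 1.0  # self-loops (assumed here, a common MCL convention)
    m = normalize_columns(m)
    for _ in range(iterations):
        m = inflate(multiply(m, m), inflation, threshold)
    # Rows that still hold entries act as attractors; the columns stored
    # in such a row are the members of that attractor's cluster.
    clusters = {frozenset(row) for row in m.values()}
    return sorted(sorted(c) for c in clusters)
```

Calling `mcl([(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)])` on two disconnected triangles returns one cluster per triangle. A fixed iteration count stands in for a real convergence check to keep the sketch short.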

When dealing with large and complex datasets, the underlying graphs can easily grow beyond what a single machine can handle. Additionally, the graphs encountered are typically sparse: the number of edges is far smaller than the maximum possible in a fully connected graph. Consequently, there is a concrete need for algorithms designed to cluster sparse graphs using distributed computing resources.

Our motivation was to develop a distributed architecture, able to accommodate large, sparse graphs, that implements the MCL and R-MCL algorithms. The Apache Spark framework was chosen for its ability to utilize distributed resources and its proven track record.

Although Spark is a framework capable of handling massive datasets, it currently does not provide rich support for computation with sparse matrices and sparse graphs. Hence, methods have been implemented that exploit sparse adjacency matrices in distributed sparse matrix multiplication, a critical component of MCL. The proposed solution can handle arbitrarily large inputs, provides almost linear speed-up as computational resources are added, and outputs results directly comparable to the non-distributed reference MCL implementation.
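
To illustrate why sparse multiplication lends itself to distribution, here is a rough sketch in plain Python of one common pattern: store each matrix as (row, column, value) triples, join entries of A and B on the shared inner index, then sum the partial products per output cell. This is an assumption about how such a multiplication could be organized, not a description of the implementation presented in the talk; `coo_multiply` and its data layout are hypothetical. In Spark, the join and reduce steps would roughly correspond to `join` and `reduceByKey` on pair RDDs.

```python
from collections import defaultdict


def coo_multiply(a_entries, b_entries):
    """Multiply sparse matrices given as (row, col, value) triples."""
    # Key step: index B by its row so it can be matched to A's column.
    b_by_row = defaultdict(list)
    for k, j, w in b_entries:
        b_by_row[k].append((j, w))

    # Join step: pair each A entry (i, k) with every B entry (k, j).
    partials = []
    for i, k, v in a_entries:
        for j, w in b_by_row.get(k, ()):
            partials.append(((i, j), v * w))

    # Reduce step: sum the partial products that share an output cell.
    sums = defaultdict(float)
    for cell, product in partials:
        sums[cell] += product

    # Keep only nonzero results so the output stays sparse.
    return sorted((i, j, s) for (i, j), s in sums.items() if s != 0.0)
```

Only stored entries ever generate work, so the cost depends on the number of nonzeros rather than on the full matrix dimensions, and each step can be partitioned by key across workers.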

Athanassios Kintsakis

Machine Learning Engineer, Capital One Financial
Athanassios Kintsakis is an ECE BSc/MSc graduate, and Ph.D. Candidate in the field of statistics and machine learning applied in bioinformatics at the Aristotle University of Thessaloniki. He has co-authored numerous journal publications, presented at international conferences and...

Sunday October 22, 2017 2:00pm - 2:30pm PDT
Room C