Linear Time Vertex Partitioning on Massive Graphs

Int J Comput Sci (Rabat). 2016;5(1):1-11.

Abstract

The problem of optimally removing a set of vertices from a graph to minimize the size of the largest resultant component is known to be NP-complete. Prior work has provided near-optimal heuristics with high time complexity that function on up to hundreds of nodes, and less optimal but faster techniques that function on up to thousands of nodes. In this work, we analyze how to perform vertex partitioning on massive graphs of tens of millions of nodes. We use a previously known and very simple heuristic technique: iteratively removing the node of largest degree and all of its edges. This approach has an apparent quadratic complexity since, upon removal of a node and its adjoining set of edges, the node degrees must be updated before the next node is chosen. However, we describe a linear-time solution using an array whose indices map to node degree and whose values are hash tables indicating the presence or absence of a node at that degree value. Memory usage also grows linearly, which is surprising given that the time complexity is lowered from quadratic to linear. We empirically demonstrate linear scalability and linear memory usage on random graphs of up to 15,000 nodes. We then demonstrate tractability on massive graphs through execution on a graph with 34 million nodes representing Internet-wide router connectivity.

Keywords: Vertex partitioning; graph cuts.
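
The degree-indexed structure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `remove_by_max_degree`, the adjacency-dictionary input, and the arbitrary tie-breaking among nodes of equal degree are assumptions made for illustration. The sketch keeps one hash set per degree value and a pointer to the highest non-empty bucket. Because degrees only decrease as nodes are removed, the pointer never moves upward, so the total work is proportional to the number of nodes plus the number of edges.

```python
def remove_by_max_degree(adj, k):
    """Remove up to k nodes, each time taking a node of largest current degree.

    adj: dict mapping node -> set of neighbours (undirected, no self-loops).
    Returns the removed nodes in removal order. Ties are broken arbitrarily.
    """
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    max_deg = max(degree.values(), default=0)

    # buckets[d] is a hash set holding the nodes whose current degree is d.
    buckets = [set() for _ in range(max_deg + 1)]
    for v, d in degree.items():
        buckets[d].add(v)

    alive = set(adj)      # nodes not yet removed
    removed = []
    cur = max_deg         # upper bound on the largest current degree

    while len(removed) < k and alive:
        # Degrees never increase, so this pointer only ever moves down.
        while not buckets[cur]:
            cur -= 1
        v = buckets[cur].pop()
        alive.discard(v)
        removed.append(v)

        # Each surviving neighbour loses one edge: move it down one bucket.
        for u in adj[v]:
            if u in alive:
                d = degree[u]
                buckets[d].discard(u)
                buckets[d - 1].add(u)
                degree[u] = d - 1
    return removed
```

On a small graph made of a star centred on node 0 plus a separate edge between nodes 4 and 5, the first removal is the star centre; the leaves then have degree 0, so the second removal is node 4 or node 5.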