The Maximum Independent Set Problem

The maximum independent set problem, finding a maximum-size set of vertices in a graph such that there is no edge between any two vertices in the set, is one of the basic NP-hard optimization problems and has been extensively studied in the literature, in particular in the line of research on worst-case analysis of algorithms for NP-hard optimization problems. The algorithm with the best known runtime complexity uses a branch-and-reduce paradigm: it applies several critical reductions to the graph to reduce its size and solves the smaller instances independently, combining their solutions to form a solution to the initial problem. The aim of this thesis is to study the algorithm and implement an efficient program for computing the Maximum Independent Set on large graphs.

Thanks are due to Panagiotis Stamatopoulos for his support and guidance in shaping and successfully completing this thesis. This project was developed in Athens, Greece, between January and October 2018. In the initial phase, the current literature on the problem was surveyed and evaluated.

Once the publication on which the thesis program would be based had been identified, it was important to study and understand it thoroughly. This led to the next step, which was to implement an efficient program according to the algorithm described in the publication.

INTRODUCTION

RELATED WORK

Exact solutions

Linear time

These algorithms are efficient and able to generate independent sets much larger than other existing linear time algorithms while having a similar running time.

Memory efficient

THE ALGORITHM

Description

Implementation

As mentioned earlier, the essence of the algorithm is to recursively reduce the graph to two smaller instances, find a maximum independent set in each of them, and use these to form one for the original graph. Each SearchNode structure consists of a graph instance, an array holding its maximum independent set (initially empty), the type of branching rule to be applied to it, whether or not it contains a minimum vertex cut, and the index positions of its parent and of its potential left and right children. Its left child is then generated and assigned a reduced graph of the current one.

This continues recursively until the graph of a left child is trivial enough to compute a maximum independent set on it directly. This process repeats recursively until the backtracking reaches a SearchNode for which the maximum independent sets of both children have been computed. At that point, depending on whether or not there was a suitable minimum vertex cut in that graph instance, its maximum independent set is either the larger of the maximum independent sets of its children, or their union in the case of a cut.
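The combination rule described above can be sketched as follows. This is a minimal illustration in Python, not the thesis implementation: the function name `combine_children` and its parameters are hypothetical, and independent sets are represented as plain Python sets of vertex identifiers.

```python
def combine_children(left_set, right_set, has_vertex_cut):
    """Combine the independent sets computed for the two children of a search node.

    Assumption (hypothetical model of the rule in the text): when the node
    was split along a vertex cut, the two children describe parts of the
    graph whose solutions can be united. Otherwise the children are
    alternative branches and the larger solution is kept.
    """
    if has_vertex_cut:
        return left_set | right_set
    # Ties are broken in favor of the left child; either choice is valid.
    return left_set if len(left_set) >= len(right_set) else right_set


branch_result = combine_children({1, 2}, {3}, has_vertex_cut=False)
cut_result = combine_children({1, 2}, {3}, has_vertex_cut=True)
```

In the branching case only one child's solution survives, while in the cut case both contribute, which is why the presence of a cut must be recorded in each search node.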

Obtaining the neighbors of a node is one of the most fundamental operations on a graph structure and a very common requirement for graph algorithms. Both mutate a GraphTraversal object and iterate through the nodes of the graph, as their names suggest, ignoring removed or zero-degree nodes. An example of the ease of traversing all active edges of the graph can be seen in Figure 14.
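As an illustration of this kind of traversal, the sketch below shows one possible way to iterate over active nodes and edges while skipping removed or zero-degree nodes. It is written in Python with generators rather than the GraphTraversal object mentioned above; the class `Graph` and all of its method names are hypothetical and are not taken from the thesis program.

```python
class Graph:
    """Hypothetical adjacency-set graph with lazy node removal."""

    def __init__(self, n):
        self.adj = [set() for _ in range(n)]
        self.removed = [False] * n

    def add_edge(self, u, v):
        self.adj[u].add(v)
        self.adj[v].add(u)

    def remove_node(self, u):
        # Only mark the node; adjacency sets stay untouched, so removal
        # is O(1) and could be undone when backtracking.
        self.removed[u] = True

    def neighbors(self, u):
        """Yield the non-removed neighbors of u."""
        for v in self.adj[u]:
            if not self.removed[v]:
                yield v

    def degree(self, u):
        """Number of non-removed neighbors of u."""
        return sum(1 for _ in self.neighbors(u))

    def active_nodes(self):
        """Yield nodes that are neither removed nor of degree zero."""
        for u in range(len(self.adj)):
            if not self.removed[u] and self.degree(u) > 0:
                yield u

    def active_edges(self):
        """Yield each active edge exactly once, as (u, v) with u < v."""
        for u in self.active_nodes():
            for v in self.neighbors(u):
                if u < v:
                    yield (u, v)


g = Graph(4)
g.add_edge(0, 1)
g.add_edge(1, 2)
g.add_edge(2, 3)
g.remove_node(2)  # node 3 now has degree zero and is skipped as well
```

Marking nodes as removed instead of deleting them keeps reductions cheap, at the cost of filtering on every traversal.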

If an easy instance or a line graph reduction applies, a maximum independent set is computed on the spot. For an easy instance we use brute force, enumerating all possible independent sets within it and keeping a maximum one, but a maximum independent set can be found much faster within a line graph reduction. Because of the structure of the graphs to which that reduction applies, we can simply choose arbitrary nodes to include in the set until no more can be chosen, and we will always end up with a maximum independent set, regardless of our initial or subsequent choices.
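Both strategies mentioned above can be sketched in a few lines of Python. The function names and the edge-list representation are assumptions made for illustration only. Note that the greedy routine returns a maximal independent set on general graphs; that the result is also maximum relies on the special graph structure the text describes, which this sketch does not check.

```python
from itertools import combinations


def is_independent(vertices, edges):
    """Return True if no edge has both endpoints in `vertices`."""
    chosen = set(vertices)
    return not any(u in chosen and v in chosen for u, v in edges)


def brute_force_mis(nodes, edges):
    """Enumerate all vertex subsets and keep a largest independent one.

    Exponential in len(nodes); only sensible for very small instances.
    """
    best = set()
    for size in range(len(nodes) + 1):
        for subset in combinations(nodes, size):
            if size > len(best) and is_independent(subset, edges):
                best = set(subset)
    return best


def greedy_independent_set(nodes, edges):
    """Pick nodes in the given order, blocking the neighbors of each pick."""
    adjacent = {u: set() for u in nodes}
    for u, v in edges:
        adjacent[u].add(v)
        adjacent[v].add(u)
    chosen, blocked = set(), set()
    for u in nodes:
        if u not in blocked:
            chosen.add(u)
            blocked.add(u)
            blocked |= adjacent[u]
    return chosen


# A path on five vertices: 0 - 1 - 2 - 3 - 4
path_nodes = [0, 1, 2, 3, 4]
path_edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
exact = brute_force_mis(path_nodes, path_edges)
greedy = greedy_independent_set(path_nodes, path_edges)
```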

This is because, depending on whether the Hypernode will be included in the maximum independent set or not, we will need to decompose it and select the appropriate internal nodes to include, since we want our solution set to refer to the original graph, which had no Hypernodes. When decomposing a Hypernode and choosing to include some of its internal nodes in the maximum independent set, some of them may themselves be Hypernodes, so the decomposition must be performed recursively. We implement Tarjan's algorithms [9, 10] for vertex cuts of size 1 and 2, which are based on depth-first search and run in O(n+m) time, where n and m denote the number of vertices and edges of the graph respectively.
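For the size-1 case, the depth-first search idea can be sketched as below. This is a standard lowpoint-based formulation for finding articulation points, written recursively in Python for brevity; it is not the thesis code, the function name and the dictionary-based adjacency representation are assumptions, and the size-2 case is not covered. The graph is assumed to be simple and undirected.

```python
def articulation_points(adj):
    """Return the set of cut vertices of an undirected simple graph.

    adj: dict mapping each vertex to an iterable of its neighbors.
    Runs in O(n + m) time; recursion depth can reach n, so very large
    graphs would need an iterative variant.
    """
    disc = {}  # discovery time of each visited vertex
    low = {}   # lowest discovery time reachable from the vertex's subtree
    cut = set()
    timer = [0]

    def dfs(u, parent):
        disc[u] = low[u] = timer[0]
        timer[0] += 1
        children = 0
        for v in adj[u]:
            if v == parent:
                continue
            if v in disc:
                # Back edge: u can reach an earlier vertex directly.
                low[u] = min(low[u], disc[v])
            else:
                children += 1
                dfs(v, u)
                low[u] = min(low[u], low[v])
                # The subtree of v cannot bypass u, so u separates it.
                if parent is not None and low[v] >= disc[u]:
                    cut.add(u)
        # The DFS root is a cut vertex only if it has several subtrees.
        if parent is None and children > 1:
            cut.add(u)

    for start in adj:
        if start not in disc:
            dfs(start, None)
    return cut


# Two triangles sharing vertex 2; removing 2 disconnects the graph.
bowtie = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3, 4], 3: [2, 4], 4: [2, 3]}
cuts = articulation_points(bowtie)
```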

PROGRAM USAGE

Execution

Input

Output

EXPERIMENTAL EVALUATION

In Figure 10, we examine how the number of nodes affects performance by testing the program on a graph with 1 million nodes. Again we see very fast runtimes for smaller numbers of edges, until we reach 1380000 edges, where the runtime rises to 14 minutes. We also tried 1390000 edges, for which the runtime was estimated to be more than 5 hours.

Regarding the program's memory usage, Figures 11 and 12 show the memory consumed for graphs of 500 vertices and 1 million vertices, respectively, for varying numbers of edges. We observe that memory increases dramatically and suddenly, very similar to the behavior seen in the time performance plots. Memory consumption is expected to be high, since each SearchNode in the search tree consists of an entire graph instance.

However, since we are using depth-first search, we only need to maintain one SearchNode for each depth level of the tree. The maximum memory used is therefore limited by the depth of the search tree, and not by the total number of SearchNodes. From the above time and memory experiments, we conclude that our program runs fast on sparse graphs regardless of the number of nodes.

However, even a slight increase in edges can quickly result in much longer running times and much higher memory usage. Time and memory are expected to be proportional, since the larger the search tree, the longer it will take to find a solution, and the more memory will be required to store all these graph instances. An exception to this rule is complete or nearly complete graphs, on which the program can run very fast due to the reduction step of the algorithm.

Finally, in Figure 13 we examine the size of the maximum independent set for graphs of 500 nodes. As expected, we see that adding edges results in a decrease in size or no change at all, as more edges constrain the problem even further. This explains why at some point we see an increase in the size of the maximum independent set, rather than a decrease or no change.

Figure 9: Execution time for 1000 vertices and varying number of edges

CONCLUSIONS AND FUTURE WORK