I mentioned earlier that the Facebook puzzle named “Facebull” is a variation of the traveling salesman problem. Let us define the Facebull problem more formally and see whether that is really true.
As I mentioned before, this problem can be modeled as a graph problem, where every chemical compound is a vertex and every machine is a directed edge in the graph. The cost of the machine is the weight of the edge in the graph. In this model, the following information is given in the problem definition.
- The given graph is a weighted strongly connected directed graph, because every vertex can be reached from any other vertex.
- The given graph need not be a fully connected graph, because not every pair of vertices needs to have an edge.
Now, the following are the unknowns:
- We want to find the set of edges among the given set of edges, such that every vertex can still be reached from any other vertex.
- When there are multiple such subsets, we want to find the subset with the minimum total edge weight.
Thus the Facebull (FB) problem can be stated concisely as below.
Given a strongly connected directed graph G = (V, E) with vertex set V and weighted edge set E, find the strongly connected subgraph G’ = (V, E’), with E’ ⊆ E having minimum total weight.
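To make the definition concrete, here is a minimal brute-force sketch in Python. It is only an illustration under my own assumptions, not the solution I submitted: vertices are numbered 0 to n-1, each machine is a (source, target, weight) triple, and the names `is_strongly_connected` and `facebull_brute_force` are made up for this example.

```python
from itertools import combinations


def is_strongly_connected(n, edges):
    """Return True if every vertex 0..n-1 can reach every other vertex.

    edges is an iterable of (source, target, weight) triples.
    Assumes n >= 1.
    """
    def reaches_all(adjacency):
        # Depth-first search from vertex 0 over the given adjacency lists.
        seen = {0}
        stack = [0]
        while stack:
            u = stack.pop()
            for v in adjacency[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        return len(seen) == n

    forward = [[] for _ in range(n)]
    backward = [[] for _ in range(n)]
    for u, v, _ in edges:
        forward[u].append(v)
        backward[v].append(u)
    # Strongly connected means vertex 0 reaches every vertex (forward search)
    # and every vertex reaches vertex 0 (search over reversed edges).
    return reaches_all(forward) and reaches_all(backward)


def facebull_brute_force(n, edges):
    """Try every subset E' of the edges and return (cost, subset) for the
    cheapest strongly connected one, or None if no subset qualifies."""
    best = None
    for size in range(len(edges) + 1):
        for subset in combinations(edges, size):
            cost = sum(w for _, _, w in subset)
            if best is not None and cost >= best[0]:
                continue  # cannot improve on the best subset found so far
            if is_strongly_connected(n, subset):
                best = (cost, subset)
    return best


# A made-up example with three compounds and six machines.
machines = [(0, 1, 1), (1, 2, 2), (2, 0, 3), (0, 2, 10), (2, 1, 1), (1, 0, 1)]
cost, chosen = facebull_brute_force(3, machines)
print(cost, chosen)
```

In this made-up example the cheapest strongly connected subset costs 5 and uses four edges (0→1, 1→0, 1→2, 2→1), while the cheapest cycle through all three vertices, 0→1→2→0, costs 6. So the FB answer need not be a single cycle.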
The definition of an asymmetric travelling salesman problem (TSP) is as below.
Given a fully connected directed graph, find the minimum cost tour that visits every vertex exactly once and returns to the starting vertex.
A tour that visits every vertex of a graph exactly once and returns to the starting vertex is also known as a Hamiltonian cycle.
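For comparison, a brute-force asymmetric TSP can be sketched in a few lines of Python. This is an illustration only: the function name `tsp_brute_force` and the cost-matrix representation are my assumptions, and the tour is fixed to start at vertex 0, which leaves (n-1)! orderings to try.

```python
from itertools import permutations


def tsp_brute_force(cost):
    """cost[i][j] is the cost of travelling from i to j; it may differ
    from cost[j][i] because the problem is asymmetric.

    Return (total, tour) for the cheapest tour that starts at vertex 0,
    visits every other vertex exactly once, and returns to vertex 0.
    """
    n = len(cost)
    best = None
    for rest in permutations(range(1, n)):
        tour = (0,) + rest + (0,)
        # Add up the cost of each consecutive leg of the tour.
        total = sum(cost[a][b] for a, b in zip(tour, tour[1:]))
        if best is None or total < best[0]:
            best = (total, tour)
    return best


# A made-up 3-vertex example.
cost = [
    [0, 1, 10],
    [1, 0, 2],
    [3, 1, 0],
]
print(tsp_brute_force(cost))  # (6, (0, 1, 2, 0))
```

Note that the unknown here is an ordering of the vertices, and every leg of the tour adds its cost to the total.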
Any fully connected graph is also strongly connected, so the inputs to TSP and FB could be the same graph. Usually in a real TSP the distances between the vertices are Euclidean distances; this is not true for the FB problem.
The main difference between the TSP and the FB problem is the unknown in the problem. In TSP, the unknown is a cycle in the graph. In the FB problem, the unknown is a set of edges, with the only constraint that the resultant graph be strongly connected. Another key difference is that every time an edge is used in a TSP solution, a cost is added to the solution. However, an edge that is added to the FB solution is a machine that is bought, and it can therefore be used multiple times at no extra cost.
Finally, consider the computational complexity of TSP and the FB problem. TSP belongs to the class of NP-hard problems. A brute-force solution is O(n!) in its worst case, where n is the number of vertices of the graph. For a graph of 12 nodes, the program must consider 12! = 479001600 possibilities. A brute-force solution that considers all the possibilities for a graph of 20 nodes will take many years to complete!
There are 2^m possible subsets of edges in a graph of m edges. Therefore, a brute-force solution to FB is O(2^m), where m is the number of edges of the graph. If the given graph is fully connected, m = n(n − 1), which is roughly n², where n is the number of nodes. Therefore FB is O(2^n²) in its worst case. For a graph of 12 nodes, taking m = n² = 144, a brute-force solution to FB must consider up to 2^144 = 22300745198530623141535718272648361505980416 possibilities. O(2^n²) is worse than O(n!). Measured by the size of the brute-force search space, FB is a harder problem than TSP!
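The numbers above are easy to check with Python's arbitrary-precision integers. The two function names below are hypothetical, chosen just for this sketch, and the FB count uses the m = n² figure for a fully connected graph.

```python
import math


def tsp_search_space(n):
    """Number of vertex orderings a brute-force TSP considers: n!"""
    return math.factorial(n)


def fb_search_space(n):
    """Number of edge subsets a brute-force FB considers when the graph
    has m = n * n edges: 2^(n^2)."""
    return 2 ** (n * n)


for n in (4, 8, 12):
    print(n, tsp_search_space(n), fb_search_space(n))
```

For n = 12 the first number has 9 digits and the second has 44, and the gap keeps widening as n grows.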
Another way of saying this is that the size of the search space for an FB solution is 2^m. The solution can be anywhere within that space, so any correct algorithm must guarantee that it can find the solution anywhere in that space. An algorithm can restrict the size of its search space, but only if it can prove that the solution cannot be in the unsearched area. I will talk more about some of these pruning techniques in a future blog entry.
[cross posted from my Sun Blog]