A solution to every software problem must have two characteristics, it must be correct and and it must be efficient. One of the most important steps of solving a problem is to model the problem. A good software model of the problem determines how the problem gets solved and how efficient the solution is. Right modelling requires a good knowledge of software techniques and tools.

An engineer with good knowledge of data structures and algorithm techniques should be able to the model a single problem in multiple different ways, be able to see the trade-offs in the different models and pick a model using which the problem can be solved. Once the model has been decided, then a variety of tools can be used to implement the solution.

Here is a example. Facebook lists a variety of problems in their puzzles page that is ordered by how hard it is to solve the puzzle. Let us look at one of the hardest puzzles, "FaceBull". The problem is to find the cheapest way to produce all the chemical compounds needed for the new super-energy drink from any single source compound. How would you model this problem?

Let each compound to be a vertex of a graph. There will be a edge between any two compounds, when there is machine that converts one compound to another. The weight of this edge is the cost of acquiring the machine that does the conversion. Let us draw this graph with the example input given in the problem.

Now in this model, the problem becomes, starting from any vertex find a path with the minimum cost that visits every vertex of this graph *atleast* once. This is a variation of the classic traveling salesman problem(TSP). Traveling salesman problem is a NP-complete problem. No wonder Facebook marked this as a hard problem.

We have identified a well known and well studied problem. The brute force approach to solve the problem is in the order of O(n!), where n is the number of vertices. It will take years for such a program to complete for just 20 vertices. Although it is difficult to solve TSP and to find the optimal solution for all inputs, there are many heuristics and approximation algorithms that will give a solution that is close to the optimal solution.

Notice how we transformed a chemical industry application problem into a well studied computer science problem with the right modelling.

When Nick and Thorsten were visiting the bayarea for Open HA Cluster Summit, we talked about some of the difficult problems in Solaris Cluster and how we may model them. This is one of the exciting aspects of working at Sun and at Solaris Cluster. There are many difficult problems to solve and people are eager to solve it. I love the opportunity that every difficult problem presents. Stay tuned, we are just getting started.

*[ cross posted at http://blogs.sun.com/augustus/ ]*

Hey Augustus .. Very interesting post ! I agree it is difficult to arrive at an optimal solution for the above problem .. May be if we can arrive at a certain threshold and find solutions (paths) whose weights are less than the threshold we will have lesser number of solutions which can be optimal or slightly lesser than optimal … Just a thought …

Hi Ganesh,

Thank you. Most likely facebook would be looking at the implementation to see how many vertices your program can handle. There is no space restriction mentioned in the problem. If I were implementing this, I would try the dynamic programming approach. Another approach would be to assume the triangle inequality and try the minimum spanning tree approach. These solutions and more approaches are discussed in the wikipedia entry that I pointed to.

http://en.wikipedia.org/wiki/Travelling_salesman_problem#Computing_a_solution