Links for Crafting Software

As a practitioner of computer science and software engineering, I am always looking for opportunities to learn. I also consider learning from my peers valuable. Here are some interesting articles that I read recently.

  • API Design Matters
  • The way to keep designers sharp and honest is to make them eat their own dog food. Any process that deprives designers of that feedback is ultimately doomed to failure.

    Joshua Bloch  has given a talk on the same topic. API design was also the cause of a tiff between James Gosling and Paul Buchheit sometime back.

  • A Conversation with Steve Bourne, Eric Allman, and Bryan Cantrill – Part1 & Part2
  • All three of them did their best work to write popular tools because they needed to use it. Notice how this validates the point that I highlighted from the previous article.

  • Real-World Concurrency

    The authors have a single system view of the world though. Many of the concurrency ideas presented also applies to programming in a distributed computing environment. Systems that run MapReduce and Hadoop like workloads could use these as well.

Partly because most of the links from ACM and the ACM Queue are useful and partly to heed to Bryan Cantrill’s call to join ACM, I recently became a professional member of ACM.

[ Cross posted at http://blogs.sun.com/augustus ]

Tags: , ,

Advertisements

Modelling a software problem

A solution to every software problem must have two characteristics, it must be correct and and it must be efficient. One of the most important steps of solving a problem is to model the problem. A good software model of the problem determines how the problem gets solved and how efficient the solution is. Right modelling requires a good knowledge of software techniques and tools.

An engineer with good knowledge of data structures and algorithm techniques should be able to the model a single problem in multiple different ways, be able to see the trade-offs in the different models and pick a model using which the problem can be solved. Once the model has been decided, then a variety of tools can be used to implement the solution.

Here is a example. Facebook lists a variety of problems in their puzzles page that is ordered by how hard it is to solve the puzzle. Let us look at one of the hardest puzzles,  "FaceBull". The problem is to find the cheapest way to produce all the chemical compounds needed for the new super-energy drink from any single source compound. How would you model this problem?

Let each compound to be a vertex of a graph. There will be a edge between any two compounds, when there is machine that converts one compound to another. The weight of this edge is the cost of acquiring the machine that does the conversion. Let us draw this graph with the example input given in the problem.

Now in this model, the problem becomes, starting from any vertex find a path with the minimum cost that visits every vertex of this graph atleast once. This is a variation of the classic traveling salesman problem(TSP). Traveling salesman problem is a NP-complete problem. No wonder Facebook marked this as a hard problem.

We have identified a well known and well studied problem. The brute force approach to solve the problem is in the order of O(n!), where n is the number of vertices. It will take years for such a program to complete for just 20 vertices. Although it is difficult to solve TSP and to find the optimal solution for all inputs, there are many heuristics and approximation algorithms that will give a solution that is close to the optimal solution.

Notice how we transformed a chemical industry application problem into a well studied computer science problem with the right modelling.

When Nick and Thorsten were visiting the bayarea for Open HA Cluster Summit, we talked about some of the difficult problems in Solaris Cluster and how we may model them. This is one of the exciting aspects of working at Sun and at Solaris Cluster. There are many difficult problems to solve and people are eager to solve it. I  love the opportunity that every difficult problem presents. Stay tuned, we are just getting started.

[ cross posted at http://blogs.sun.com/augustus/  ]

Shared Nothing Storage in Open HA Cluster

Two years back I led and designed a project to make Solaris Cluster easy to use. The wizards that resulted from this effort is a key element of the Solaris Cluster user experience now. Many new projects want their features supported using the wizards today. The architecture of the project even made it to the IEEE Cluster conference.

This past year I led another important effort for Solaris Cluster. This feature, Shared Nothing Storage, that was released as part of Open HA Cluster 2009.06 removed a major hardware requirement for the cluster: the necessity to have a shared storage. This was achieved by configuring the iSCSI protocol stack present in COMSTAR in a particular fashion and layering a ZFS mirror on top. This feature allows a user to use any local disk present in the system as a storage for the service and to make that service highly available.

There is no need to turn disk fencing on for this configuration and therefore it also removes the need to have SCSI reservations. Here is a picture of the configuration, with detailed configuration instructions here.

The key challenge in providing this feature was to make the cluster device subsystem robust enough to handle devices that are attached via the network. The design details are present here.

This configuration becomes more interesting when I/O multipathing is configured, because it shows the flexibility and the power of the COMSTAR architecture. With COMSTAR, a single logical unit of storage can be accessed via multiple port providers, multiple iSCSI targets in this case. These multiple iSCSI targets can be used to create multiple paths to the same logical unit. This provides fast mirroring of data in the cluster configuration. If you want to understand the different configurations with multi-pathing, Aaron Dailey and Scott Tracy have a excellent white paper on using MPxIO on Solaris. Here is a picture of the cluster configuration with I/O multipathing.

 Try it out. Join the discussions at ha-clusters-discuss@opensolaris.org