HAProxy is a high-availability, software-based HTTP load balancer that I’ve seen gaining a lot of traction in large server clusters and cloud computing environments. I’m currently using it as part of a pre-built cluster image that a third-party vendor maintains, and its performance impressed me enough that I’ve started to look into its capabilities further. Because it’s a software solution, it gives you a lot of flexibility to customize its configuration.
One of the neat features I came across is a configuration that lets you reboot servers in a cluster without a single user experiencing a 404 error, downtime, or a lost session. The trick is to use an iptables rule so that Apache answers on two ports, say 80 and 81. Apache really runs on port 80, and port 81 is simply forwarded to port 80. HAProxy is then configured to use the application server’s port 81 as the primary, and the same server at port 80 is defined as the hot backup.
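To make the port forward concrete, here is a minimal sketch of the iptables rule, assuming Apache listens on port 80 and that a REDIRECT rule in the nat table's PREROUTING chain suits your setup. The function names and the IPTABLES override variable are my own, not from any post or tool; setting IPTABLES="echo iptables" gives you a dry run that only prints the commands.

```shell
#!/bin/sh
# Hypothetical helpers. IPTABLES can be overridden for a dry run,
# e.g. IPTABLES="echo iptables". Running the real command needs root.
IPTABLES="${IPTABLES:-iptables}"

# Redirect incoming TCP traffic on port 81 to port 80, where Apache listens.
enable_forward() {
    $IPTABLES -t nat -A PREROUTING -p tcp --dport 81 -j REDIRECT --to-ports 80
}

# Delete the same rule, so connections to port 81 are no longer redirected.
disable_forward() {
    $IPTABLES -t nat -D PREROUTING -p tcp --dport 81 -j REDIRECT --to-ports 80
}
```

Source the file and call enable_forward once at boot; disable_forward is what you would run to start pulling the machine out of rotation.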
The igvita.com blog has a good how-to on doing just this:
Instead of specifying a physically different app server, we’re going to define our backup instance to be the exact same application server in each case, but with one minor difference: the status port, for the main app server will be different from the one we use on the backup.
Now, if we want to put the server into maintenance mode, we remove the IPTables rule for the forwarded port, and wait a few seconds so that our upstream HAProxy instance recognizes that the server is no longer available for new connections – this is key, it means that no client is dropped in the process. Now, once the server is out of rotation in HAProxy, we can do a graceful restart, add the IPTables rule back in, and we’re live!
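For reference, a configuration along the lines the quote describes might look like the sketch below. It is not copied from the igvita.com post: the section name, bind port, IP addresses, cookie name, and check timings are all placeholders I picked for illustration. The part that matters is that each machine appears twice with the same cookie value, once as a primary whose health check probes port 81 and once as a backup checked on port 80.

```
listen app_cluster
    bind *:8080
    mode http
    balance roundrobin
    # Cookie-based persistence: returning users stick to the server named in SERVERID.
    cookie SERVERID insert indirect
    option httpchk HEAD /

    # Primary entry: the health check probes port 81, which only answers
    # while the iptables redirect is in place.
    server web1     10.0.0.11:80 cookie web1 check port 81 inter 2000
    # Backup entry: same machine, same cookie value, health-checked on port 80.
    server web1_bak 10.0.0.11:80 cookie web1 check port 80 inter 2000 backup

    server web2     10.0.0.12:80 cookie web2 check port 81 inter 2000
    server web2_bak 10.0.0.12:80 cookie web2 check port 80 inter 2000 backup
```

If you would rather have the primary entry send its traffic to port 81 as well, as I described earlier, the primary lines would point at 10.0.0.11:81 and 10.0.0.12:81 instead. Either way, a backup server receives no new sessions while any primary is up, but requests that arrive with a matching persistence cookie are still served by it.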
What’s cool is that you can pull a machine offline gracefully without any reconfiguration on the proxy. You simply disable the iptables port forward; HAProxy notices that port 81 has gone offline and starts sending that server’s existing users to port 80, matched by their current cookies. In reality it’s the exact same Apache instance, so all session information remains intact. New sessions are all sent to your other servers, and you can wait until nobody is left using the maintenance-mode machine before taking it offline.
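Put together, one maintenance cycle on a single machine could be scripted roughly like this. It is only a sketch under a few assumptions: the drain time is a fixed sleep rather than a real check that the machine is idle, the restart uses apachectl graceful, and the maintenance_cycle and run names plus the DRY_RUN and DRAIN_SECONDS variables are made up for this example.

```shell
#!/bin/sh
# Sketch of one maintenance cycle. Set DRY_RUN=1 to print the commands
# instead of running them (the real commands need root).
# Assumed: the drain time exceeds HAProxy's check interval times its failure threshold.
DRAIN_SECONDS="${DRAIN_SECONDS:-10}"

# Run a command, or just print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

maintenance_cycle() {
    # 1. Drop the 81 -> 80 redirect so HAProxy's view of port 81 goes down.
    run iptables -t nat -D PREROUTING -p tcp --dport 81 -j REDIRECT --to-ports 80
    # 2. Give HAProxy time to notice and stop sending new sessions here.
    run sleep "$DRAIN_SECONDS"
    # 3. Restart Apache gracefully; in-flight requests are allowed to finish.
    run apachectl graceful
    # 4. Restore the redirect so the machine rejoins the rotation.
    run iptables -t nat -A PREROUTING -p tcp --dport 81 -j REDIRECT --to-ports 80
}
```

Running it with DRY_RUN=1 first is a cheap way to see exactly what would be executed. For a full reboot rather than a restart, you would stop after step 2, wait until the machine really is idle, and leave step 4 to a boot script.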