High Availability Rails Cluster

Tuesday, 08. 19. 2008  –  Category: all, sw, web

I’ve been asked about this a few times, so I figured I’d post here. This is a brief description of a highly available Rails cluster I’ve built. Some preliminaries:

  • There’s no invention here; I believe this setup is very common.
  • High availability isn’t the same thing as load balancing. There is nothing here to intelligently share load across the frontend servers, and one backend server is essentially idle all the time.
  • This cluster is built with a bunch of open-source software on non-fancy kit. As such it doesn’t have the enormous capacity of clusters built upon commercial shared-storage products, SAN kit, layer 7 web switches etc. Its ambition is to run a few busy Rails sites well whilst coping with hardware failure gracefully.

Layout

(Diagram: two frontend hosts running Nginx, Mongrel, Wackamole and Spread; a backend pair running MySQL, DRBD, NFS and Heartbeat.)

Operation

  • Web traffic is spread across the managed frontend interfaces by multiple A records in the DNS (zone file sketch after this list).
  • Wackamole uses a Spread messaging network to ensure these multiple A record IPs are always present across the frontend. It achieves this by managing the hosts’ interfaces when it detects hosts joining or leaving the cluster (config sketch below).
  • A pair of MySQL servers run in master:master configuration on the backend hosts (my.cnf sketch below).
  • The backend hosts use DRBD to maintain a mirrored block device between them (resource definition sketch below).
  • These block devices back an NFS filesystem (exports sketch below).
  • Heartbeat runs on the backend hosts to do several tasks (haresources sketch below):
    1. Manage which host is the DRBD primary and therefore can be written to.
    2. Manage which host has the DRBD filesystem mounted and exported with NFS.
    3. Manage the IP through which the frontend mounts the filesystem and talks to MySQL.
  • With all this in place, Nginx accepts web connections, serves static assets off the NFS mount, and passes other requests to Mongrel, an HTTP server that’s well suited to running a Rails instance (nginx.conf sketch below).
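
The DNS round-robin is just multiple A records for the same name. A minimal sketch in BIND zone-file syntax; the domain and 192.0.2.x addresses are placeholders:

    ; two frontend IPs behind one name; resolvers rotate between them
    www.example.com.  300  IN  A  192.0.2.10
    www.example.com.  300  IN  A  192.0.2.11

A dead address matters less than usual here, because Wackamole moves it to a surviving host rather than leaving it to time out of caches.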
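For Wackamole, a minimal wackamole.conf sketch; the interface name, addresses and Spread port are assumptions, so treat it as a starting point rather than a working config:

    Spread = 4803@127.0.0.1          # local Spread daemon
    Group = wack1                    # Spread group the frontends join
    Control = /var/run/wack.it

    # the virtual IPs (the DNS A records) that must always be held by some host
    VirtualInterfaces {
        { eth0:192.0.2.10/32 }
        { eth0:192.0.2.11/32 }
    }

    Prefer None                      # any member may pick up any IP

    # gratuitous ARP so switches learn quickly when an IP moves
    Notify {
        eth0:192.0.2.0/24
        arp-cache
    }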
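Master:master MySQL is two ordinary masters replicating from each other. A my.cnf sketch for the first node, with placeholder names; the second node is identical except for server-id = 2:

    [mysqld]
    server-id = 1             # must be unique across the pair
    log-bin   = mysql-bin     # binary log the peer replicates from

Each node is then pointed at the other with CHANGE MASTER TO and started with START SLAVE.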
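A DRBD resource definition for the mirrored device, in the DRBD 8 style; the host names, devices and replication addresses are assumptions:

    resource r0 {
        protocol C;                    # synchronous: a write completes on both nodes
        on back1 {
            device    /dev/drbd0;
            disk      /dev/sda3;       # local backing partition
            address   10.0.0.1:7788;   # replication link
            meta-disk internal;
        }
        on back2 {
            device    /dev/drbd0;
            disk      /dev/sda3;
            address   10.0.0.2:7788;
            meta-disk internal;
        }
    }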
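The filesystem on the DRBD device is exported from whichever node currently holds it. An /etc/exports sketch, assuming a placeholder mount point and subnet:

    # export the DRBD-backed filesystem to the frontends
    /srv/share  10.0.0.0/24(rw,sync,no_root_squash)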
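Heartbeat v1 expresses those three tasks as an ordered resource chain in /etc/ha.d/haresources. One line; the node name, mount point, init script name and service IP are assumptions:

    # on failover: promote DRBD, mount it, start NFS, raise the service IP
    back1 drbddisk::r0 Filesystem::/dev/drbd0::/srv/share::ext3 nfs-kernel-server IPaddr::10.0.0.100/24/eth0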
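Finally, a trimmed nginx.conf sketch for the frontends; the paths and Mongrel ports are placeholders, and try_files needs a reasonably recent nginx (the 2008-era idiom was an if (!-f $request_filename) block instead):

    upstream mongrels {
        server 127.0.0.1:8000;    # one Mongrel process per port
        server 127.0.0.1:8001;
    }

    server {
        listen 80;
        # static assets live on the shared NFS mount
        root /srv/share/app/current/public;

        location / {
            # serve the file if it exists, otherwise hand off to Rails
            try_files $uri @rails;
        }

        location @rails {
            proxy_set_header Host $host;
            proxy_pass http://mongrels;
        }
    }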

Notes

  • One of the main hazards of MySQL master:master setups is primary key collision if an INSERT occurs on both hosts at once. We avoid that here by letting Heartbeat manage the single IP through which the frontends connect, so only one master takes writes at a time.
  • I’ve built two of these clusters to date. The second one is now four servers wide on the frontend.

Future work

  • DRBD can now run in dual-primary mode, allowing both hosts to accept writes. This makes it a candidate for filesystems like GFS that use shared storage to present a filesystem writable on multiple hosts (config delta after this list). More here.
  • To add some load balancing I’m considering using HAProxy or LVS to actively distribute traffic across the frontends (sketch below).
  • HA aside, there are also some cool things like evented Mongrel that would be interesting to try.
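
Dual-primary mode is a small delta to the DRBD resource definition shown earlier; a sketch in DRBD 8 syntax:

    resource r0 {
        net {
            allow-two-primaries;     # both nodes may be Primary at once
        }
        startup {
            become-primary-on both;
        }
        # on back1 / back2 sections unchanged
    }

A cluster filesystem such as GFS is still required before the device can safely be mounted on both nodes at once.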
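For active load balancing, an HAProxy sketch; the addresses are placeholders, and the balancer itself would need to be made highly available too (e.g. with a Heartbeat-managed IP):

    listen web
        bind 192.0.2.100:80
        mode http
        balance roundrobin
        option httpchk GET /          # health check; dead frontends drop out
        server fe1 192.0.2.10:80 check
        server fe2 192.0.2.11:80 check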

One Response to “High Availability Rails Cluster”

  1. Roderick van Domburg Says:

    What you can do to prevent MySQL ID collisions is configure the nodes to increment IDs by two and have the two nodes start at different offsets. At RailsCluster we too let Heartbeat manage the IP, but do the ID trick just in case a network partition occurs.
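
    For reference, that trick maps onto two MySQL variables; a my.cnf sketch for a two-node pair:

        # node A
        auto_increment_increment = 2    # step IDs by the number of masters
        auto_increment_offset    = 1    # node A issues 1, 3, 5, ...

        # node B
        auto_increment_increment = 2
        auto_increment_offset    = 2    # node B issues 2, 4, 6, ...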
