How to scale Rails

So we established previously that Rails really can scale!  Now it’s time to look at how to do it.  I’ll list here the basic steps and then follow up with a post on each of these:

  1. Of course, Memcached helps a ton!
  2. Start with a master database and add two or three slave databases.  Split your reads from your writes. Rails has a great plugin for this that makes it about as easy as it can be (certainly easier than PHP). This will move you to being able to handle 15 or so app servers.
  3. Denormalize some of your data to reduce query intensity.
  4. Shard your data. Put statistics/usage data on a totally separate set of DB boxes from your users’ data. If you really are the next MySpace, check out AsterData for offloading some of that work.
  5. Switch to a commercial DB setup. If you’re still desperate to expand, then a more robust DB is probably in order. The great part: Rails is database agnostic (at least far more so than PHP or ASP) and will happily switch without too much effort.
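
To make step 2 a bit more concrete, here is a minimal sketch of read/write splitting in plain Ruby. It is not the plugin mentioned above; `FakeConnection` and `ReadWriteSplitter` are hypothetical names made up for illustration, and the "is this a read?" check is a deliberately naive regular expression. Writes go to the master, reads are spread round-robin across the replicas (the slave databases from step 2).

```ruby
# Hypothetical stand-in for a real database connection: it only records
# the statements it receives so we can see where each one was routed.
class FakeConnection
  attr_reader :name, :log

  def initialize(name)
    @name = name
    @log = []
  end

  # Records the statement and returns the name of this connection.
  def execute(sql)
    @log << sql
    name
  end
end

# Hypothetical router: reads go to replicas, everything else to the master.
class ReadWriteSplitter
  # Naive assumption: anything starting with SELECT or SHOW is a read.
  READ_PATTERN = /\A\s*(select|show)\b/i

  def initialize(master:, replicas:)
    @master = master
    @replicas = replicas
    @next_replica = 0
  end

  # Returns the name of the connection that handled the statement.
  def execute(sql)
    connection_for(sql).execute(sql)
  end

  private

  def connection_for(sql)
    # Fall back to the master for writes, or when no replicas exist.
    return @master if @replicas.empty? || sql !~ READ_PATTERN

    # Round-robin across the replicas.
    replica = @replicas[@next_replica % @replicas.size]
    @next_replica += 1
    replica
  end
end

master = FakeConnection.new("master")
replicas = [FakeConnection.new("replica-1"), FakeConnection.new("replica-2")]
db = ReadWriteSplitter.new(master: master, replicas: replicas)

db.execute("INSERT INTO users (name) VALUES ('alice')")
db.execute("SELECT * FROM users")
db.execute("SELECT * FROM posts")
```

A real setup also has to cope with replication lag (a read issued right after a write may not see it yet on a replica), which this sketch ignores.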

Rails lets you write really bad SQL

So last post, we established that Rails’ Achilles heel is in scaling the DB. PHP, Perl, ASP, etc. all rely on a database behind the scenes, and all need to scale their DB alongside the rest of the application. Rails often gets the bad rap though.
Why? I can think of only one reason:
Rails lets you write REALLY bad SQL very easily
Which means we wind up with Rails applications that are dying a slow DB death much sooner than their counterparts that use a different language/framework. Once again, I’m going to play the “Not Rails’ fault” card. When you have developers with no database experience (and no desire to learn about databases) start developing database-backed web apps, there’s going to be a major problem! RDBMSs are extremely complicated, and building a successful Rails app means you need at least a moderate understanding of what is going on under the hood. Without that understanding, your Rails app will never scale. But a little bit of study and appreciation for what is really going on will allow you to quickly spot what is and what is not acceptable Rails code.
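
One common example of SQL that is easy to write badly is the "N+1 query": load a list of records, then run one more query per record to fetch something related. The sketch below is a plain-Ruby simulation, not real ActiveRecord code; `CountingDB` and both helper methods are hypothetical names invented here, and the "database" is just an in-memory object that counts how many queries it receives. In ActiveRecord, the batched version corresponds roughly to eager loading (for example `includes` in current versions).

```ruby
# Hypothetical in-memory "database" that counts the queries it receives.
class CountingDB
  attr_reader :query_count

  def initialize(posts, authors)
    @posts = posts      # array of hashes like { id: 1, author_id: 2 }
    @authors = authors  # hash of author id => author name
    @query_count = 0
  end

  # One query: something like SELECT * FROM posts
  def all_posts
    @query_count += 1
    @posts
  end

  # One query per call: this is the "N" in N+1.
  def find_author(id)
    @query_count += 1
    @authors[id]
  end

  # One query for many ids: something like WHERE id IN (...)
  def find_authors(ids)
    @query_count += 1
    @authors.select { |id, _name| ids.include?(id) }
  end
end

# Naive version: 1 query for the posts, then 1 more per post.
def author_names_naive(db)
  db.all_posts.map { |post| db.find_author(post[:author_id]) }
end

# Batched version: 1 query for the posts, 1 for all of their authors.
def author_names_batched(db)
  posts = db.all_posts
  authors = db.find_authors(posts.map { |post| post[:author_id] }.uniq)
  posts.map { |post| authors[post[:author_id]] }
end
```

With 50 posts, the naive version issues 51 queries and the batched version issues 2, for the same result. Both look equally innocent in the application code, which is exactly the problem.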

Keep these thoughts in mind and your Rails app will scream, scale, & [some other s-word] all the way to becoming the next Twitter.

Why Rails really can scale

There have been a number of posts throughout the community recently talking about scaling Rails. Of course this has been an endlessly debated issue for years now, but it is coming back to the forefront thanks to news like Blaine Cook’s recent departure from Twitter (and Twitter’s many other recent issues). I feel it’s time to throw my two cents in on the matter.

First a little background: I’ve deployed countless Rails applications, from tiny deployments with a single VPS to several 10+ server setups. For those who scoff and say that 10 machines is small…they’re probably right, but not everyone can be as lucky as Twitter. But these sites are all easily capable of handling multi-million pageviews per day, which is enough for 99.9% of the sites on the web.


Now to begin with, let’s talk about why Rails is GREAT at scaling:

1. Rails has a very clean line between what belongs on the ‘web’ server and what belongs on the ‘app’ server.
A lot of people have a major problem with the fact that Rails can’t be served by a typical web server (like Apache + mod_php or IIS/ASP). But the truth is that as soon as you grow beyond two servers, it becomes a great benefit to have a clear, easy-to-follow line between the web server’s job and the app server’s job. It’s easy to delegate each responsibility to the server that can do the best job, and this separation is of instant benefit. For those who still don’t like it: Passenger/mod_rails is making great strides.

2. Rails can scale out very easily.
Adding additional app servers is very easy, thanks to the fact that you must already have a reverse proxy set up. And the scaling is about as linear as it comes. The app servers and web servers are already ‘share-nothing’ (besides maybe the caching)…so it is common sense & almost second nature to add more boxes as needed. For more info, see It’s boring to scale with Rails.
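
As a rough illustration of why this is so easy, here is a toy model of what the reverse proxy does with its list of app servers. `BackendPool` is a hypothetical class written for this post, not part of any real proxy, and real proxies add health checks, weighting, and so on. The point is only that with share-nothing app servers, scaling out amounts to adding one more entry to the pool.

```ruby
# Hypothetical model of a reverse proxy's list of app servers.
class BackendPool
  def initialize(backends)
    raise ArgumentError, "need at least one backend" if backends.empty?

    @backends = backends.dup
    @cursor = 0
  end

  # Scaling out: the new box needs no knowledge of the existing ones.
  def add(backend)
    @backends << backend
  end

  # Plain round-robin: each request goes to the next server in the list.
  def next_backend
    backend = @backends[@cursor % @backends.size]
    @cursor += 1
    backend
  end
end

pool = BackendPool.new(%w[app1 app2])
4.times { pool.next_backend }  # alternates between app1 and app2
pool.add("app3")               # from now on, app3 takes a share of requests
```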

So where did Rails get this bad rap about scaling?

I suspect it comes from two issues:

1. Ruby is slow.
ActiveRecord is slow. Rails is not fast. But this really has nothing to do with scaling. Altogether, it means that to handle a given load, you will need more hardware than a comparable app written in Perl or PHP. This issue is about giving up a piece of performance to get a much larger piece of productivity & maintainability.
2. RDBMS’s do not scale well.
AHA! We run into the largest problem most people have and where a lot of the argument seems to stem from. Eventually, you will expand to the point where your one little MySQL box just can’t handle it any more. It is absolutely saturated and we start having all sorts of problems. MySQL is not terribly easy to scale. Postgres isn’t much better. This is not Rails’ fault. Of course, it doesn’t matter whose fault it is: if Rails needs a database and your database doesn’t scale, then there’s a definite problem. Fortunately, there are plenty of things that you can do to deal with these problems.

In the very near future, I’ll talk about the two things that you can do to turn Rails into the best-scaling framework out there.