Threading In Ruby

As you may recall, or have otherwise noticed, I recently launched Waves, a Ruby-based open-source Web app framework. One of the interesting points of feedback has been the whole issue of thread-safety. Now, to begin with, I’ve briefly addressed this issue on the web site and in an interview. But the fact remains that there seems to be a lot of hostility toward threads, and also a lot of confusion about using them.

This is a little bit weird for me. I’ve been around long enough to remember when threading was a very obscure experimental feature that was never available on the more mundane kinds of projects I was working on. And then, as it became more mainstream, it was difficult to make use of, whether because of problems with the implementation, the compiler, or the application code. Finally, in the late nineties, especially with the rise of the Web, it became one of those things that everyone did, even if they didn’t really need it.

We seem to be experiencing some kind of backlash at this point. Threads were cool for a while, but now they’re evil. In the Ruby community, some of the cool kids are really serious about thread support and some of them aren’t. At the same time, Ruby itself is in a major transition when it comes to threading, which just adds to the confusion.

One thing is clear: Ruby hasn’t been very good at threading in the past. Ruby itself uses “green threads,” meaning threads scheduled by the interpreter rather than the operating system, which is basically faking it. Rails, Ruby’s flagship library, only supports threading as an afterthought. So, in a sense, it isn’t at all surprising that there has been a lot of effort in the Ruby community to find alternatives. There really wasn’t much choice. In the process, a lot of folks have begun to wonder if threads are really necessary. It’s sort of like having your TV break and then, after a few days of not having it, realizing you can live without Survivor after all. But just because you can live without threads doesn’t mean you should.

There seem to be three main arguments against using threads, outside of the lack of support in Ruby and Rails. The first is the “threading is hard” argument, which says that threading tends to lead to hard-to-find bugs. The second is the “processes, not threads” argument, which states that it is just as fast, or faster, to run everything in separate processes anyway. The third is the “hardware is cheap” argument, which says that, nowadays, you can just throw hardware at a scalability problem, so who cares?

While there is a learning curve associated with threading, I don’t see how it is any different from the one for event-driven processing. For example, in a threaded app, you have to use a mutex around a mutable class variable. But in the event-driven model, where you scale by running several separate processes, you can’t really have a mutable class variable at all, because the other processes won’t see the changes to the value. Either way, a naive programmer can get themselves in trouble. Basically, there is a learning curve associated with any kind of concurrency, not just threads. Furthermore, Web apps actually lend themselves to concurrency. The fact that you can handle separate requests in separate processes without any IPC between them actually proves the point: they don’t need to share anything. Thus, it’s straightforward for a Web app framework to support a highly efficient thread-per-request model.
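To make the mutex point concrete, here is a minimal sketch. The HitCounter class and its method names are made up for illustration, and the loop that spawns threads merely stands in for a thread-per-request server; it is not how Waves or any other framework actually does it. It assumes Ruby 1.9 or later, where Mutex is built in.

```ruby
# Hypothetical example: a class-level counter shared by every request thread.
class HitCounter
  @@count = 0          # the mutable class variable
  @@lock  = Mutex.new  # guards every read and write of @@count

  # Increment the shared counter. The mutex makes the
  # read-modify-write sequence safe across threads.
  def self.increment
    @@lock.synchronize { @@count += 1 }
  end

  # Read the counter under the same lock.
  def self.count
    @@lock.synchronize { @@count }
  end
end

# Stand-in for a thread-per-request server: ten "requests",
# each handled in its own thread, each bumping the counter 100 times.
threads = Array.new(10) do
  Thread.new do
    100.times { HitCounter.increment }
  end
end
threads.each(&:join)

puts HitCounter.count  # prints 1000
```

Run the same class in several separate processes instead and each process ends up with its own private count, which is the other half of the argument: the shared value is safe there only because it isn’t really shared.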

Event-driven processing is actually faster in Ruby, but that is more a function of Ruby’s weak support for threading than anything intrinsic to the architecture. JRuby is becoming mature enough to bring threading back into the picture, and I am looking forward to running multithreaded apps against Ebb event-driven apps in JRuby. And even if event-driven processing is faster, I’m not sure it’s actually cheaper, which is what scalability is really all about. Event-driven applications are more complex to deploy, take up more memory, and potentially introduce expensive context switches into the picture. It also isn’t clear that multiple CPUs can take advantage of processes as easily as they can threads.

Neither Java nor PHP has ever needed event-driven architectures, and they remain the de facto standard. The “hardware is cheap” argument fails miserably when confronted with real-world examples. While it may be true that hardware is cheaper than developer time, many applications already require tens of thousands of dollars of hardware and full-time sysadmins to keep it all running. If you can get by with 2 servers instead of 15, why not? And if you can get by without having to hire a dedicated sysadmin, why wouldn’t you? Even if you did need to hire an extra developer to help iron out some concurrency bugs, it would still be a win because the developer can also add features to your application or fix other kinds of bugs.

It is one thing to benchmark a “Hello World” application and say you can do several thousand requests per second, or whatever. But real apps typically have other bottlenecks besides the request handling itself, and they aren’t going to hit 1,000 requests per second on a single-CPU machine. At 50 or 100 requests per second per CPU, the concurrency requirements are magnified by an order of magnitude, so even small differences in efficiency are worth exploring. Being dogmatic about it and simply ruling out threading as an option, which is what some advocates of event-driven architectures seem to be doing, doesn’t really make sense.
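One way to read the “order of magnitude” claim is as simple arithmetic. The sketch below assumes a hypothetical target load of 1,000 requests per second and asks how many CPUs it takes to serve it at different per-CPU rates; the cpus_needed helper and the target figure are illustrative, not measurements.

```ruby
# Hypothetical helper: how many CPUs are needed to serve a target load,
# given how many requests per second one CPU can handle.
def cpus_needed(target_rps, per_cpu_rps)
  (target_rps.to_f / per_cpu_rps).ceil
end

target = 1_000  # assumed load, in requests per second

# The "Hello World" rate from the benchmark, then two more realistic rates.
[1_000, 100, 50].each do |rate|
  puts "#{rate} req/s per CPU: #{cpus_needed(target, rate)} CPU(s)"
end
```

Under these assumptions the benchmark rate needs one CPU, while the realistic rates need ten or twenty, so a modest per-request saving gets multiplied across every one of those machines.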

A lot of smart people with a lot of good experience are working to make threading a realistic option in Ruby. It is only a matter of time. Perhaps Ruby’s story is really just beginning. The emergence of viable threading models in Ruby is potentially a quantum leap forward for the language. While I don’t think Ruby should aim to be PHP or Java, by any means, I do think it might be worth emulating the things those communities did well. And that includes using threads to support high levels of concurrency with simple deployment models. In the end, the Ruby community may come up with something better: that would be awesome and very much in the spirit of Ruby. But I’ll need more than a “Hello World” benchmark to be convinced.