Evolution vs Revolution

Posted: Sun, 15 January 2006

To rewrite, or not to rewrite, that is the question. If there's one thing programmers seem to love doing, it's rewriting a piece of software from scratch. Which is funny, because "the industry" spends a lot of time bemoaning duplication of effort and generally touting "reusability" and "component-based software development".

Joel on Software has a fairly good treatise on why rewriting software, in general, is a Bad Thing. I may not agree with him all the time, but Joel is always worth a read, and I certainly can't fault his reasoning on this issue.

My primary experience with rewriting software, as with most things software-related, is in the Free Software world. Although I've been involved in (or led) rewrites of proprietary software in the past, I've seen a lot more "let's rewrite it all" death-marches in FOSS. While I wouldn't characterise them universally as "failures", they don't seem to have the beneficial effect the authors were probably hoping for.

What does seem to happen, in these FOSS projects, is that the developers squirrel themselves away in their backyard labs, using the latest and greatest in software engineering processes and tools, to produce... the same end result. (If you produce something totally different, it's not a rewrite; it's a new piece of software that does much the same things as the old one.) When I say "the same end result", I mean exactly that -- it does the same thing, the code is still subject to coderot over time, and so on. It doesn't seem like much useful output has been produced. So why do we do it? Joel gives some good reasons (and explains why they're not such good reasons).

Rewrites always take a very, very long time to do. It's in their nature. They're also very demoralising -- you're not really creating anything new, you're just recreating existing functionality. In the beginning, though, it's amazingly exciting -- a whole new world of bugfree code, an engineering marvel, something for the annals of software history.

But several months later, as your new code is becoming cruftier by the day, and you're just duplicating functionality that already exists, you get bored. And boredom is the death knell of FOSS projects. I presume that this excitement-fading-to-boredom progression is why everyone starts rewrites, and also why so many of them are abandoned (or, at least, take a lot longer to complete than estimated). I'd make a comparison to that favourite target of unfair comparisons, marriage, but I know my wife reads my blog, so I won't. <grin>

This boredom factor is what has affected the python-appserver-backed rewrite of IRM (from php3-style PHP). I took on maintenance of the existing PHP codebase over a year ago, and the rewrite had been in progress for a while at that point. Since then, the PHP version of IRM has gained a bunch of useful new features, and there still isn't any indication of a usable tarball emerging from the rewrite.

A rewrite in progress is a terrible thing. It tends to make people wary of working on the existing codebase, since it looks like a dead end. Without something to work on, they'll tend to wander off to other projects. However, sometimes (due to existing deployments), users will work on the dead end anyway. If the developers have been more-or-less neglecting the existing codebase while they take on their glorious rewrite, then the rest of the world is going to have been playing with and tweaking the existing code, adding their own improvements and modifications. The result of this, when the new rewrite finally releases, is that there are great wodges of third-party code that are incompatible with the rewrite, and a large chunk of your userbase says "stuff it" and sticks with what they know.

This is what looks to be happening in the world of SysCP. A few days ago, I found out that they're planning on doing a rewrite from the functional-spaghetti-PHP of the current version to the cult-of-MVC-PHP. This is nice, but their current estimate is of a first RC release in February. I take this estimate with a boulder of salt -- for the full collection of usual reasons (boredom, real life, overoptimism, etc). The fact that there's not a huge amount of movement on the commit log doesn't inspire confidence, either, and despite early claims of a massive development team, only one person has been committing recently.

In the meantime, I'm tasked with adding a number of small features to the current SysCP codebase, which we need for work. When the rewrite comes out, it's highly unlikely that there will be a business case for rewriting my features for the new code, and it's not likely that the features we need will make it into core. So, SysCP will probably fork -- the old, featureful, crufty codebase, vs the new, shiny codebase. If there are sufficient new features to warrant an upgrade, we'll take the plunge, but what SysCP does now is very close to what we want, so it'll take a fair amount of shiny! to get us to move.

This is a general issue of software development in FOSS -- if you take your eye off the ball, for any reason, someone else will probably pick it up and run with it. Your rewrite will effectively be a completely new project: you'll get some of the users, and the fork of your old codebase will continue to live on in another guise.

Let's face it, though -- your code is going to accumulate cruft. Over time, it morphs into this bletcherous pile of huge functions, multi-purpose classes, and reams and reams of bug reports. How do you deal with this cruft? Surely at some point it's easier just to throw it out and start again?

Reality says no. Joel gave the key advice in the article linked above (go read it if you haven't already -- seriously), and it's strongly hinted at in my article title. Don't revolt; instead, evolve your codebase. Tweak it, evolve it, move it slowly (and compatibly) towards your utopian ideal of perfect code. You won't get there, but you'll almost certainly get closer by gradual and continuous improvement than you would by going back to the start line and walking in a different direction.
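To make "slowly (and compatibly)" a bit more concrete, here's a minimal sketch of one evolutionary step. It's in Python purely for illustration, and every name in it is made up; it isn't taken from IRM, SysCP, or Joel's article. The idea is to pull small pieces out of a crufty function while keeping the entry point's signature and behaviour, so existing callers (and third-party patches) don't notice the change.

```python
# All names here are hypothetical; only the shape of the change matters.

def invoice_total_legacy(items, tax_rate):
    """The 'before': parsing, arithmetic and rounding all in one lump."""
    total = 0
    for item in items:
        qty = int(item["qty"])
        price = float(item["price"])
        total = total + qty * price
    total = total + total * tax_rate
    return round(total, 2)


# Step 1: extract small pieces that can be read and tested on their own.
def line_amount(item):
    """Amount for a single invoice line."""
    return int(item["qty"]) * float(item["price"])


def apply_tax(amount, tax_rate):
    """Add tax to an amount, using the same formula as the old code."""
    return amount + amount * tax_rate


# Step 2: rebuild the entry point on top of the extracted pieces,
# with the same arguments and the same return value as before.
def invoice_total(items, tax_rate):
    subtotal = sum(line_amount(item) for item in items)
    return round(apply_tax(subtotal, tax_rate), 2)


if __name__ == "__main__":
    sample = [{"qty": "2", "price": "9.99"}, {"qty": 1, "price": 5}]
    # The evolved version must agree with the old one before it replaces it.
    assert invoice_total(sample, 0.1) == invoice_total_legacy(sample, 0.1)
```

In a real codebase the new body would simply replace the old one under the original name; the `_legacy` suffix exists here only so the two versions can sit side by side and be compared. Each step like this is small enough to ship on its own, which is the whole point.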

There's one big question that's nagging at me, though -- what happens when you want to switch languages? I'm loving Ruby on Rails at the moment, and so are several other people at work, but we've got (both personally and professionally) huge swathes of code in PHP (and other second-rate languages) that need to be dealt with. It's a lot harder to evolve code into a different language than it is to apply a gradual degunking.

I have a number of ideas on this point, but I'd love to hear your ideas too. In a few months, I'll have a much better idea of what works and what doesn't in massive PHP to Rails conversions, and I'm sure everyone out there in the blogosphere will hear all about it in due time.

