Sunday, February 27, 2011

Running branches for continuous publishing

I am a very strong proponent of what are called running branches for development of software, and for the stabilization and publication of online games. One of the more important features of large scale online games is that they live a long time, and have new content, bug fixes and new features added over time. It is very difficult to manage that much change with a relatively large amount of code and content. And since you continue to develop more after any release, you will want your developers to be able to continue working on the next release while the current one is still baking in QA, and rolling toward production.

I will skip the obvious first step of making the argument that version control systems (aka source code change control, revision control) are a good idea. I like Perforce. It has some nice performance advantages over Subversion for large projects, and has recently incorporated ease of use features like shelving and sandbox development. I like to call the main line of development mainline. I also like to talk about the process of cutting a release and deploying it into production as a "train". It makes you think about a long slow moving object that is really hard to stop, and really difficult to add things to and practically impossible to pull out and pass. And if you get in the way, it will run you down, and someone will lose a leg. Plus it helps with my analogy of mainline and branch lines.

So imagine you are preparing your first release. You make a build called Release Candidate 1 (RC1), and hand it off to QA. You don't want your developers to go idle, so you have two choices: they can pitch in on testing, or they can start working on Release 2. You will probably do a bit of each, especially early in the release cycle, since you often dig up some obvious bugs, and can keep all your developers busy fixing those. But at some point they will start peeling off and need something to do. So you sic them on Release 2 features, and they start checking in code.

Then you find a bug. A real showstopper. It takes a day to find and fix. Then you do another build and you have RC1.1. But you don't want any of the Release 2 code that has been checked in over the past several days. It has new features you don't want to release, and has probably introduced bugs of its own. So you want to use your change control system to make a branch. And this is where the philosophy starts. You either make a new branch for every release, or you make a single Release Candidate branch and, for each release, branch on top of it. The second option is what I mean by a running branch.
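To make the isolation concrete, here is a minimal toy model, not tied to Perforce or any real tool. It assumes the RC branch is cut at the moment RC1 is built; the `Branch` class and the change descriptions are made up for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Branch:
    """Toy model of a branch: an ordered list of change descriptions."""
    name: str
    changes: List[str] = field(default_factory=list)

    def submit(self, description: str) -> None:
        self.changes.append(description)

    def branch_to(self, name: str) -> "Branch":
        # The new branch starts as a copy of everything submitted so far.
        return Branch(name, list(self.changes))


mainline = Branch("ML")
mainline.submit("release 1 feature work")

# Cut RC1: the RC branch starts from mainline as it is right now.
rc = mainline.branch_to("RC")

# Developers peel off and start Release 2 work on mainline.
mainline.submit("release 2: new mission type")

# The showstopper is fixed on the RC branch, which gives RC1.1.
rc.submit("fix showstopper crash")

print(rc.changes)
```

The RC branch ends up with the Release 1 work plus the fix, and none of the Release 2 changes that landed on mainline in the meantime.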

Being prepared ahead of time for branching can really save you time and confusion, especially during the high stress periods of pushing a release, or making a hotfix to production. So I'm really allergic to retroactive branching, where you only make a branch if you find a bug and have to go back and patch something.

Here's why: the build system has to understand where this code is coming from, or you will be doing a lot of manual changes right when things are the most stressed. If you have already decided to make branches, you will also have your build system prepared and tested to know how to build off the branch. You will also have solved little problems like how to name versions, prepare unambiguous version strings so you can track back from a build to the source it came from, and many more little surprises.
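As one way to picture an unambiguous version string, here is a small sketch. The format (`RC1.1-cl48213`), the field names, and the changelist number are all my own assumptions for illustration, not a standard; the point is only that the string carries enough to get back to the branch and the exact change the build was made from.

```python
import re
from typing import NamedTuple


class BuildVersion(NamedTuple):
    branch: str      # hypothetical branch tag, e.g. "RC" or "ML"
    release: int     # release number
    candidate: int   # bumps with each rebuild of the same release
    changelist: int  # source control change the build was synced to


def format_version(v: BuildVersion) -> str:
    """Render the version as e.g. 'RC1.1-cl48213'."""
    return f"{v.branch}{v.release}.{v.candidate}-cl{v.changelist}"


_VERSION_RE = re.compile(r"^([A-Z]+)(\d+)\.(\d+)-cl(\d+)$")


def parse_version(text: str) -> BuildVersion:
    """Recover the branch and changelist from a version string."""
    m = _VERSION_RE.match(text)
    if m is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    branch, release, candidate, changelist = m.groups()
    return BuildVersion(branch, int(release), int(candidate), int(changelist))


version = format_version(BuildVersion("RC", 1, 1, 48213))
print(version)
print(parse_version(version))
```

If a string like this is stamped into every build, anyone holding a build can tell which branch and which change to sync to when a bug report comes in.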

The build system is another reason why I prefer running branches as opposed to a new branch per release. You don't have to change any build configuration when a new release comes along. The code for RC2 is going to be in exactly the same place as RC1. You just hit the build button. That kind of automation and repeatability is key to avoiding "little" mistakes. Like accidentally shipping the DB schema from last release, or wasting time testing the old level up mechanism, or missing the new mission descriptions.

And then there is the aesthetic reason. If you cut a branch for every release, your source control depot is going to start looking pretty ugly. You are planning on continuous release, right? Every month. After 5 years that would be 60 complete copies of the source tree. Why not just 2: ML and RC (and maybe LIVE, but let's save that for another time).
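The build-configuration point and the depot-clutter point can be sketched together. The depot paths below are hypothetical and only there to show the shape of the problem: with a running branch the build always reads from one place, while a branch per release means a new path (and a build config edit) every month.

```python
# Hypothetical depot layout; these paths are illustrative only.
RUNNING_BRANCH = "//depot/game/RC/..."


def running_branch_source(release: int) -> str:
    """With a running branch, every release builds from the same path."""
    return RUNNING_BRANCH


def per_release_source(release: int) -> str:
    """With a branch per release, the source path changes every release."""
    return f"//depot/game/release-{release}/..."


# One release a month for 5 years is 60 releases.
releases = range(1, 5 * 12 + 1)

running_paths = {running_branch_source(r) for r in releases}
per_release_paths = {per_release_source(r) for r in releases}

print(len(running_paths), len(per_release_paths))
```

One path to configure and test versus sixty, each of which is a chance to point the build at the wrong place.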

Finally, as a developer, if you are lucky enough to be the one making the hotfix, you will want to get a copy of the branch onto your machine. Do you really want another full copy for each release that comes along? Or do you just want to do an update to the one RC branch you've prepared ahead of time? It sure makes it easier to switch back and forth.

An aside about labels: You might argue you could label the code that went into a particular build, and that is a good thing. But one problem with labels that has always made me very nervous is that labels themselves are not change controlled. Someone might move a label to a different version of a file, or accidentally delete it or reuse it, and then you would lose all record of what actually went into a build. You can't do that with a branch. And if you tried, you would at least have the change control records to undo it.

One more minor thought: if you want to compare all the stuff that changed between RC1 and RC2, it is much easier to do in a running branch. You simply look at the file history on the RC branch and see what new stuff came in. To do that when using a branch per release requires a custom diff each time you want to know: e.g. drag a file from one branch onto the same file on the other. Pretty clumsy.
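Here is a rough sketch of that comparison, using a made-up history for the RC branch rather than output from any real tool. It assumes you recorded the change number each build was made from; the numbers and descriptions are invented.

```python
from typing import List, NamedTuple


class Change(NamedTuple):
    number: int
    description: str


# Hypothetical submit history of the single RC branch, oldest first.
rc_history = [
    Change(100, "branch release 1 from mainline"),
    Change(104, "fix showstopper crash"),
    Change(230, "branch release 2 on top of RC"),
    Change(236, "fix level-up rounding bug"),
]


def changes_between(history: List[Change], old_build: int, new_build: int) -> List[Change]:
    """Changes submitted after the old build, up to and including the new one."""
    return [c for c in history if old_build < c.number <= new_build]


# Assumed change numbers recorded when RC1.1 and RC2 were built.
new_in_rc2 = changes_between(rc_history, 104, 236)
for change in new_in_rc2:
    print(change.number, change.description)
```

Because both releases live on the same branch, the question "what is new in RC2?" is just a range over one history, with no cross-branch diff needed.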

Also note that these arguments don't apply as well to a product that has multiple versions shipped and in the wild simultaneously. An online game pretty universally replaces the previous version with the new one at some point in time. The two versions only exist side by side during the release process.

Summary:
  • You want to branch so you can stabilize without stopping ongoing work for the next release
  • You want a branch so you are ready to make hot fixes
  • You want a running branch so your build system doesn't have to get all fancy, and so your repo looks simpler.


I may revisit the topic of branching in the form of sandbox development, which is useful for research projects and for sharing between developers without polluting the mainline.