Monday, October 25, 2010

Architect it for Horizontal DB Scalability

The performance of the database in an MMO tends to be the limiting factor in how many simultaneous players an individual Cluster can support. Beyond that number, the common approach is to create exact copies of the Cluster and call them Shards or Realms. The big complaint about sharding is that two friends may not be able to play together if their stuff happens to be on different shards. Behind the scenes, their stuff is in separate database instances: each shard accesses only one DB instance because the DB engine can handle only so much load.

There are a couple of approaches that can almost completely remove these limits, both of which depend on creating many DB instances and routing each request to the right instance.

The first is an "automated" version of a character transfer. When a player logs in, they are assigned to an arbitrary cluster of machines, and all their stuff is transferred to the DB for that cluster. The player has no idea which cluster they were connected to; the clusters don't even have names. There are a couple of problems with this:
1) You can't do this if you want the state of the world around the player to be dynamic and persistent: the stuff you throw on the ground, whether that bridge or city has been burned. A player would be confused if the dynamic state of the world were different each time they logged in. This isn't all that common these days, however. Interesting problem, though.
2) the player might not be able to find their friends. "Hey I'm beside the fountain that looks like a banana-slug. Yeah, me too. Well, I can't see you!"
You might be able to deal with this by automatically reconnecting and transferring the friends to the same cluster and DB, but that gets tricky. In the worst case, you might wind up transferring *everyone* to the same cluster if they are all friendly.

Another approach provides horizontal scalability without assuming anything about how you shard your world, how you do dungeon instancing, which DB engine you use, or many other complications. That is a nice property: it keeps the DB system loosely coupled and useful across a broad spectrum of large-scale persistent online games.

What dynamic data are you persisting anyway? The stuff you want to come back after a power failure. Most games these days only care about the state of the player and their stuff. A player may have multiple characters, lots of data properties, an inventory of items, skills learned, quests in progress, friend relationships, ... If you sort through all your data, you'll find that you have "tuning" or design data that is pretty much static between releases. And you have data that is "owned" by a player.

To a DB programmer, that means the bulk of the data can use player_id as at least part of its primary key. So here is the obvious trick:
Put all data that belongs to a given player into a single DB. It doesn't matter which DB or how many there are. You have thousands or millions of players, and spreading them horizontally across a large number of DB instances is trivial. You could use a modulus (even player_ids to the left, odd to the right), but then adding an instance forces you to reshuffle almost everyone. Better is a range scheme: put the first 100,000 players in the first DB, the next 100,000 in the second, and so on. As your game gets more successful, you simply add new DB instances.
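The range scheme above can be sketched in a few lines. Everything here is illustrative, not from any particular game: the bucket size, the DSN strings, and the function names are all made up for the sketch.

```python
# Range-based shard routing: player 0..99,999 on shard 0, the next
# 100,000 on shard 1, and so on. Adding capacity is just appending
# a new DSN; no existing player ever moves.

BUCKET_SIZE = 100_000

# One entry per DB instance (hypothetical connection strings).
SHARD_DSNS = [
    "db://shard0.internal/game",
    "db://shard1.internal/game",
]

def shard_index(player_id: int) -> int:
    """Map a player_id to the index of the DB instance that owns it."""
    return player_id // BUCKET_SIZE

def dsn_for_player(player_id: int) -> str:
    """Return the connection string for the shard owning this player."""
    idx = shard_index(player_id)
    if idx >= len(SHARD_DSNS):
        raise LookupError(f"no shard provisioned for player {player_id}")
    return SHARD_DSNS[idx]
```

Since game servers, not clients, talk to the DB, each server just keeps one pooled connection per shard and calls something like `dsn_for_player` to pick which pool to use.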

A couple of simple problems to solve:
1) you have to find the right DB. Given that any interaction a player has is with a game server (not the DB directly), your server code can compute which DB to consult based on the player_id it is currently processing. Once it decides, it will use an existing connection to the correct DB. (It is hard to imagine a situation where you would need to maintain connections to 100 DB instances, but that isn't really a very large number of file descriptors in any case.)
2) If players interact and exchange items, or perform some other "transaction", you have to co-persist both sides of the exchange, and the two players might be on different DB instances. A transactional exchange of items can be solved with an escrow system: a third-party "manager" takes ownership of the articles in the trade from both parties, and only once that step is complete does the escrow object hand the articles over to their new owners. The escrow object is persisted as necessary and can pick the transaction back up after a failure. The performance of this system is not great, but this kind of interaction should be rare. You could route a lot of this trade through an auction house, or through in-game email, where ownership of an item is removed from a player (and their DB) and transferred to an entirely separate system.
3) High-speed exchange of stuff like hit points, or buffs, doesn't seem like the kind of thing players would care about if there was a catastrophic server failure. They care about whether they still have the sword-of-uberness, but not whether they are at full health after a server restart.
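The escrow idea in point 2 can be sketched as a tiny state machine. This is a hypothetical illustration: the class, the state names, and the persist() hook (which stands in for durably writing escrow state to its own DB row so a crashed trade can resume) are all invented for the sketch.

```python
from enum import Enum, auto

class TradeState(Enum):
    COLLECTING = auto()  # taking ownership of items from both players
    HELD = auto()        # escrow owns everything; safe to release
    DONE = auto()

class Escrow:
    """Third-party manager for a two-player trade across shards."""

    def __init__(self, offer_a: list, offer_b: list):
        self.offer_a, self.offer_b = offer_a, offer_b
        self.held_a: list = []
        self.held_b: list = []
        self.state = TradeState.COLLECTING
        self.persist()

    def persist(self):
        # Placeholder: write escrow state durably so a failure
        # mid-trade can be recovered by replaying from this state.
        pass

    def take_ownership(self):
        # Remove the items from each player's own shard; the escrow
        # object now owns them, so neither player can dupe or lose them.
        self.held_a, self.offer_a = self.offer_a, []
        self.held_b, self.offer_b = self.offer_b, []
        self.state = TradeState.HELD
        self.persist()

    def release(self):
        # Only after BOTH sides are held do the items change hands.
        assert self.state is TradeState.HELD
        a_gets, b_gets = self.held_b, self.held_a
        self.held_a, self.held_b = [], []
        self.state = TradeState.DONE
        self.persist()
        return a_gets, b_gets
```

The key property is that every intermediate state is persisted, so after a server restart the escrow can tell whether it still needs to collect, release, or refund.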

Some people might consider functional decomposition to get better DB performance, e.g. splitting the DBs by function: eCommerce, inventory, player state, quest state, ... But that only gets you maybe 10 or 12 instances, and the load will be uneven; if the inventory DB carries half the total load, much of the remaining hardware is a waste of money. With data decomposition (splitting across player_id), on the other hand, you get great parallelism, and the cost of your hardware scales linearly with the number of players that are playing. And paying.

Another cool thing about this approach is that you don't have to use expensive DB tech or expensive DB hardware, and you don't need fancy master-master replication. Exploit the application knowledge that player-to-player interaction is relatively rare, so you don't need transactionality on every request. Avoid that hard problem; it costs money and time to solve.

There is a phrase I heard from a great mentor thirty years ago: "embarrassingly parallel". You have an incredible number of players attached, and most actions that need to be persisted are entirely independent. Embarrassing, isn't it?

Now your only problem is how big to make the rest of the cluster around this monster DB. Where is the next bottleneck? I'll venture to say it is the game design. How many players do you really want all stuffed into one back alley or ale house? And how much content can your team produce? If you admit that you have a fixed volume of content, and a maximum playable density of players, what then?

1 comment:

  1. This is a great article, thanks for putting it together. You are spot on regarding how to shard by player, we have done this in gaming and other types of apps too. We also can do some clever things for Parallel Query (we call it Go Fish) to bring back a single result set from queries run in parallel across a number of shards.

    Where you do run into limits with this approach is the actual number of physical connections and network bottlenecks -- but there are ways around that too.

    This paper has some more useful info on the topic: