Thursday, September 18, 2008

Forced immediate migration is nasty

Let's say we have a solution that somehow doesn't suffer from poor decomposition of space (too many small pieces, or no way to break the load into balanceable pieces -- i.e. having overloads). There is still another very difficult technical problem. When an Entity just crosses a geometric boundary, some of these systems will require the Entity to migrate onto the new host *immediately*. That is because there are assumptions built in elsewhere about where to look for an Entity, or how far an Entity can see/be seen.

The problem with a forced immediate migration is that an Entity might be working on something tricky just then. Some use cases and consequences:
  • Engaged in battle; delays/hitches during marshal/transmit/unmarshal during one of the most critical gameplay experiences
  • Running a script; how to pack up an in-flight Lua or Python script? Turns out Stackless Python supports in-flight script pickling. Another option is to write your own language, and build pickling in your Virtual Machine. Pretty complicated, and possibly slow. What about temporary or global variables; external references. I believe Eve does this, but I'm not sure if they do in-flight migration.
  • Being persisted to the DB, running a financial transaction; anything considered critical and fault sensitive should not be made even more complex by injecting a synchronous but distributed action. You are asking for deadlocks and race conditions in what is by definition the most critical aspect of the system.
There are two possible solutions that both allow a simpler migration system, but as a very beneficial side-effect allow Entities to interact across host boundaries:
  • Do everything event-oriented. This means that there are never any Behaviors outstanding at the end of ticking an Entity. When the migration service runs, each Entity has become just a set of Properties. The problem with this is that content developers find event oriented programming confusing and complicated. They have to explicitly manage a logical context (what is this Entity doing over a period of several events?), or fall back to a state-machine mechanism that adds a different kind of complexity (and more tools). To make it worse, you can still have race conditions (who opened that chest first, vs. who pulled out the loot?).
  • Don't migrate immediately. I think this is the silver bullet, and it is possible to realize. Even more interestingly, it is possible to *never* migrate, and that opens the door to using Entity Behavior technology that is not possible to migrate (e.g. C/C++ running on a Posix thread with pointers hanging out all over the place; computation that is only sensible to run on certain kinds of hardware). And the thought of not paying a migration performance penalty is kind of tantalizing. I'm going too far. In practice you would want to do the migration; there are tons of benefits.
Again, I'm going to make you wait a bit longer before I tell you how all this can work.

No comments:

Post a Comment