Sunday, October 19, 2008

Back to First Principles: Interest Management

When you think of an Entity needing to interact with its environment, you don't tend to think about arbitrary lines running through the geometry. (There is no colored line on the ground between countries.) In fact, players delight in trying to find those boundaries and do things while standing on either side, or jumping back and forth across them as fast as possible.

The way to think of the problem is that geometric decomposition is solely to support load balancing. Stuff on this side runs on this host, stuff on that side runs on that host. Much of the rest of the system just takes advantage of that assumption. (And it is not such a great assumption.)

But what if we ignore load balancing, and just think of the Entities all over the place trying to interact? At the extreme, each Entity would be on its own host. Now we have a classical distributed systems problem, and can tap into that knowledge.

Distributed object technologies, like CORBA, hide the fact that some objects are remote by using a local smart-proxy. Interactions by a locally owned/executed Entity with the local proxy are forwarded to the remote-original object. The big problem here is that CORBA can block the requestor, and the request has a round-trip latency.

The better way to solve this is to ensure the local proxy is already up to date before the local Entity starts interacting. This allows the proxy's state to be as accurate as physically possible (the local proxy is out of date by at most one one-way network latency).
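As a minimal sketch of that idea, assuming the owning host pushes state to each proxy ahead of time: the local Entity only ever reads the proxy's cached copy, so a read never waits on the network. The names here (`EntityProxy`, `apply_update`, `read`) and the version stamp used to drop stale updates are hypothetical choices for illustration, not part of any particular system.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class EntityProxy:
    """Local stand-in for an Entity that runs on another host (hypothetical)."""
    entity_id: str
    version: int = -1  # assumption: the owner stamps each update with a counter
    state: Dict[str, Any] = field(default_factory=dict)

    def apply_update(self, version: int, state: Dict[str, Any]) -> bool:
        """Called by the network layer when the owning host pushes new state."""
        if version <= self.version:
            return False  # stale or duplicate update, ignore it
        self.version = version
        self.state = dict(state)  # copy so the caller cannot mutate our cache
        return True

    def read(self, key: str, default: Any = None) -> Any:
        """Local read: no network round trip, so the caller never blocks."""
        return self.state.get(key, default)


proxy = EntityProxy("orc-1")
proxy.apply_update(1, {"hp": 10})
proxy.apply_update(0, {"hp": 99})  # arrives late, so it is dropped
print(proxy.read("hp"))
```

The cached state can still lag the original, but only by the time one update takes to travel one way.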

Now we have to solve the interest management problem. The system wouldn't scale if we broadcast every Entity update to every host (both in network consumption and in the space needed to store the proxies). Here we rely on a few restrictions that we think are not too onerous. The Entity must declare what it is interested in, and it must never write directly to a local proxy.

The simplest interest management approach is to break the world into tiles. If an Entity can see into a tile at all, it is interested in all of that tile. If another Entity is currently located in that tile, it publishes its state updates to that tile. Using a publish-subscribe communication mechanism, each interested Entity consumes the state of every Entity it can see. (There are much more interesting interest management approaches we will discuss later.)
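The tile scheme can be sketched in a few lines. This is an in-process toy under stated assumptions: square tiles of a made-up size, a circular view radius approximated by its bounding square, and a hypothetical `TileBus` class standing in for a real publish-subscribe system. None of these names come from an existing library.

```python
from collections import defaultdict

TILE_SIZE = 10  # assumption: square tiles, 10 world units on a side


def tile_of(x, y):
    """Map a world position to the (column, row) of the tile containing it."""
    return (int(x // TILE_SIZE), int(y // TILE_SIZE))


def visible_tiles(x, y, radius):
    """Every tile touched by the bounding square of the view radius.

    Seeing into a tile at all means being interested in the whole tile.
    """
    min_tx, min_ty = tile_of(x - radius, y - radius)
    max_tx, max_ty = tile_of(x + radius, y + radius)
    return {(tx, ty)
            for tx in range(min_tx, max_tx + 1)
            for ty in range(min_ty, max_ty + 1)}


class TileBus:
    """Toy publish-subscribe hub keyed by tile (hypothetical)."""

    def __init__(self):
        self._subscribers = defaultdict(set)  # tile -> ids of interested Entities
        self.inboxes = defaultdict(list)      # entity id -> received updates

    def subscribe(self, entity_id, tiles):
        for tile in tiles:
            self._subscribers[tile].add(entity_id)

    def publish(self, sender_id, x, y, state):
        """Deliver the sender's state to everyone interested in its tile."""
        for subscriber in self._subscribers.get(tile_of(x, y), ()):
            if subscriber != sender_id:
                self.inboxes[subscriber].append((sender_id, state))


bus = TileBus()
bus.subscribe("alice", visible_tiles(5, 5, radius=10))
bus.publish("bob", 12, 3, {"hp": 40})    # inside a tile alice can see
bus.publish("carol", 25, 25, {"hp": 7})  # outside alice's interest set
print(bus.inboxes["alice"])
```

Note that nothing in the sketch refers to which host an Entity runs on. Only positions and tiles decide who hears about whom.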

The result is that we don't have the nasty load balancing problems of other systems. The host on which an Entity is running doesn't matter. Two Entities can interact with each other no matter where they are hosted. And the simulation operates the same way it would in other systems.

The only remaining technical challenge is building a publish/subscribe system that is reliable and efficient.