Thursday, May 21, 2009

Adaptive/Dynamic Categorization

Any static partitioning of data flows using a static algorithm (like a geometric grid) will break down. Either due to something intrinsic, or because the players are evil, and delight in ignoring your simplifying assumptions.

So for stability and performance/scale reasons a categorization policy that is dynamic will be better. Periodically a centralized or distributed computation can inspect the in-use categories and determine whether there is a better mapping.

Imagine there was a flash-crowd come to see an epic battle. The initial mapping had that area mapped to one category. The area can be split into two area and given two distinct categories. At a specific point in time, the new mapping will be instituted. Call that the next epoch. We need to do that across all users of those categories. The easiest way would be to broadcast the new mapping. Once all participants have acknowledged they have the new mapping and re-subscribe using that mapping, producers can start using the new categories. Once all participants have acknowledged adopting the new mapping, the old categories can be unsubscribed from.

There are many interesting dynamic categorization policies. They are based on live measurement of the system as the game is being played. The new categorization is then applied live. Clearly this can provide better efficiency than offline analysis and updating a static mapping, since in-game patterns can change minute to minute.
  • Split and merge geometric grid cells or triangles. This is how many mmo systems work today, so if that is your preference you can keep it while migrating to using categories.
  • Compute "nearness" between producers and consumers, and identify clusters of entities. The set of entities identified is given a category. As entities move around, they subscribe to other clusters of entities. Once the interconnections become too "stretched", a new clustering can be computed. There are pretty simple clustering algorithms, such as inspecting R-squared for an Entity against various proposed clusters to find the nearest one inside some threshold. Note that this approach can be applied to individual entities incrementally as they move.
  • A similar clustering algorithm can be applied to actual communication patterns as opposed to assuming that geometric nearness always indicates high interaction rates. The recategorization system would record and analyze metrics of interactions as they change in real time.
  • The hosting software can detect overloads of various resources, and indicate to the recategorization system that splitting off instances of busy areas would be valuable (assuming your game design supports that approach).
  • As dungeons or PvP arenas are spawned, a new set of categories especially for the players involved will be computed and used. Because the subscribers indicate which instance they are currently assigned to, the categorization policy will compute the Categories for that instance, and a player will not see anything from a different instance.
  • A game may have unique capabilities like being able to add, remove or move land masses. As coordinate systems drift around, the categorization policy would need to take this into account. It may be able to do that using relative coordinates, or it may need to completely recategorize space.
  • Changes in team or guild size may trigger recategorization.
  • As a clarification, in contrast to dynamic categorization, consider this scenario: As density increases in an area, the policy can adjust the maximum sight distance. While this doesn't directly change the decomposition of the data, it will result in fewer categories being consumed. (This is really a subscription policy change, not a categorization policy change)
Really, the key thing is to consider that real time feedback can be used to help improve the data distribution performance of your system.

Obviously an algorithmic policy is better than an enumerated one, since distributing new parameters to factor into a computation requires less data than a complete enumeration of the descriptions for each category.

Like almost any distributed computation, recategorization can benefit from incremental changes. Local measurements and performance triggers can instigate a categorization change to just a few mappings, and may be able to be distributed to only a few machines. These incremental changes can quickly ripple through the category space and come to an approximately optimal mapping. One benefit of the incremental approach over the periodic approach is that the trigger to recompute can be an event as opposed to a polling of metrics and a recomputation that may have limited or no effect.

No comments:

Post a Comment