- large numbers of simulators
- an ability to load balance based on load (not just geography)
- an authoritative simulator to avoid DB bottlenecks
- a single-write paradigm to avoid overly complex synchronization
Setting aside the policy and impetus for initiating an Entity migration, let's talk about the mechanics. By separating policy from mechanism, we can experiment with or customize the policy to use application-specific information, resulting in a closer-to-optimal solution. And we won't have to reimplement the mechanism each time.
We know we can run an Entity on any host by using interest management as discussed earlier to feed an Entity everything it needs to operate correctly. So all we really need to realize a migration is:
- getting the Entity state onto the new host
- getting the data that Entity needs flowing to that host
- doing this quickly enough that there are no hiccups visible to the players
- avoiding all ordering problems and race conditions so there is no game logic difference compared to not migrating (no side effects)
- surviving crashes of any component at the worst possible moment (i.e., preserving important transactionality) without significant impact to the players
There is a handshake needed to get this state across:
- suspend further execution of the entity so things don't change during the migration
- transmit the state
- recreate the entity on the target host
- resume execution of the entity
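The four steps above can be sketched in a few lines. This is only a minimal illustration, not a real implementation: `Entity`, `Simulator`, and `migrate_basic` are hypothetical names, both hosts live in one process, and a dictionary copy stands in for serializing the state over the network.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """Hypothetical entity: an id, a state dict, and a running flag."""
    entity_id: int
    state: dict = field(default_factory=dict)
    running: bool = True


class Simulator:
    """Hypothetical simulator host that owns a set of entities."""

    def __init__(self, name):
        self.name = name
        self.entities = {}

    def add(self, entity):
        self.entities[entity.entity_id] = entity


def migrate_basic(entity_id, source, target):
    """Move one entity from source to target using the four-step handshake."""
    entity = source.entities[entity_id]

    # 1. Suspend execution so the state cannot change mid-transfer.
    entity.running = False

    # 2. Transmit the state (a copy stands in for serialization on the wire).
    snapshot = dict(entity.state)

    # 3. Recreate the entity on the target, still suspended.
    clone = Entity(entity_id, snapshot, running=False)
    target.add(clone)
    del source.entities[entity_id]  # single writer: only one copy may exist

    # 4. Resume execution on the target.
    clone.running = True
    return clone
```

Note that the entity is only resumed once the source copy is gone, which is what keeps the single-write paradigm intact in this sketch.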
That basic handshake says nothing about subscriptions, persistence ordering, or stale messages, so we need to add some steps:
- Get the target subscriptions set up and acknowledged before the transfer so when the Entity arrives, all data is available there that it had in its original location
- Have the original simulator "flush" its DB queue so the DB never sees out-of-order persistence requests, and then stop persisting that entity until after the migration.
- increment an "epoch" counter to allow us to discard any replication messages or requests from the past
- Given the added time and complexity, it may be worth optimizing the process by pre-loading the target host without actually pausing the original entity. Then, once everything is set up, resend any state that changed during the preload. Of course, you might also make use of any state that was previously replicated to the target.
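Here is a rough sketch of how the first three extra steps might fit together. Everything in it is an assumption made for illustration: the `Host` class, the `migrate` function, the shared list standing in for the DB, and the choice to stamp each message with an epoch number and drop anything older than the entity's current epoch. The preload optimization is left out to keep it short.

```python
from collections import deque


class Host:
    """Hypothetical simulator host with subscriptions and a persistence queue."""

    def __init__(self, name, db):
        self.name = name
        self.db = db                # shared list standing in for the database
        self.db_queue = deque()     # persistence requests not yet written
        self.subscriptions = set()  # (entity_id, topic) pairs set up here
        self.entities = {}          # entity_id -> {"state": dict, "epoch": int}

    def subscribe(self, entity_id, topics):
        """Set up subscriptions and return an acknowledgement."""
        for topic in topics:
            self.subscriptions.add((entity_id, topic))
        return True

    def persist(self, entity_id, state):
        """Queue a persistence request; nothing reaches the DB until flush()."""
        self.db_queue.append((entity_id, dict(state)))

    def flush(self):
        """Write queued requests to the DB in the order they were issued."""
        while self.db_queue:
            self.db.append(self.db_queue.popleft())

    def accept(self, entity_id, epoch, message):
        """Apply a message unless it is stamped with an epoch from the past."""
        record = self.entities.get(entity_id)
        if record is None or epoch < record["epoch"]:
            return False  # unknown entity or stale message: discard
        record["state"].update(message)
        return True


def migrate(entity_id, topics, source, target):
    """Migrate with subscriptions, DB flush, and an epoch bump."""
    record = source.entities[entity_id]

    # Subscriptions must be acknowledged before the entity arrives.
    if not target.subscribe(entity_id, topics):
        raise RuntimeError("target did not acknowledge subscriptions")

    # Drain the source's DB queue so the DB never sees out-of-order writes.
    source.flush()

    # Hand over the state under a new epoch; the source stops owning it.
    del source.entities[entity_id]
    new_epoch = record["epoch"] + 1
    target.entities[entity_id] = {"state": dict(record["state"]),
                                  "epoch": new_epoch}
    return new_epoch
```

After `migrate` returns, a late message still carrying the old epoch is rejected by `target.accept`, while one carrying the new epoch is applied.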
One of the coolest features of interest management is that you can choose to not migrate and the game still runs the same (but may use more datacenter-only networking). So if you can't migrate an entity until it finishes a behavior (because you can't migrate your stack), no problem, just wait. Program that into your policy. If you find that hitches are visible but only when a player is in a heavy combat situation, your policy can delay initiating the migration until the participants have been quiescent for a while.
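As a sketch of what "program that into your policy" could look like, here is a tiny policy check kept entirely separate from the migration mechanism. The `EntityStatus` fields, the `should_migrate` function, and the 10-second quiet period are all made up for illustration; a real policy would use whatever application-specific information you have.

```python
from dataclasses import dataclass


@dataclass
class EntityStatus:
    """Hypothetical per-entity facts a migration policy might consult."""
    in_behavior: bool        # mid-behavior, so its stack cannot be migrated
    last_combat_time: float  # seconds, on the same clock as `now`


def should_migrate(status, now, quiet_seconds=10.0):
    """Return True only when migrating now should be invisible to players."""
    # Can't migrate a stack: just wait until the behavior finishes.
    if status.in_behavior:
        return False
    # Delay until the entity has been out of combat for a while.
    return now - status.last_combat_time >= quiet_seconds
```

Because declining to migrate is always safe, the policy can simply return `False` and be asked again later.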