Writing ad hoc threaded code is almost always more difficult than it is worth. You might be able to get it to work, but the next guy in your code is probably going to break it by accident, or find it so hard to deal with, they will want to rewrite it. So you want some kind of simplifying policy about how to develop threaded code. It needs to perform well (the primary reason for doing threading in the first place, other than dealing with blocking), require little mental overhead or boiler plate code, and ideally be able to be used by more junior developers than yourself.
Functional parallelism only extracts a small fraction of available performance, at least in larger, "interesting" problems. Data parallelism is often better at extracting more performance. In the case of large online game servers, the ideal thing to parallelize across are game entities. There are other types of entities that we can generalize to, as well, once this is working.
Here's what I've come up with...
Generally, you want to avoid ticking entities as fast as possible. While this makes a certain amount of sense on the client (one tick per graphics frame) to provide the smoothest visual experience, on the server it will consume 100% of the CPU without significantly improving the player's experience. One big downside is that you can't tell when you need to add more CPUs to your cluster. So I prefer event driven models of computing. Schedule an event to occur when there is something necessary to do. In some cases, you may want to have periodic events, but decide what rate is appropriate, and schedule only those. In this way, you can easily see the intrinsic load on a machine rise as more work (entities) are being handled. So the core assumption is: we have a collection of entities, handling a stream of events.
We built an event dispatcher. It is responsible for distributing events (that usually arrive as messages) to their target entities. I've discussed a number of means to route those messages and events in other posts. When an event is available, the system has the entity consume the event, and runs an event handler designed for that type of event.
You don't want to spawn a thread per entity. There could be thousands of entities per processor, and that would cause inefficient context switching, and bloat memory use for all those threads sitting around when only a few can run at a time. You also don't want to create and destroy threads every time an entity is created or destroyed. Instead, you want to be in control of the number of threads spawned regardless of the workload. That allows you to tune the number of threads for maximum performance and adjust to the current state of affairs in the data center. You create a thread pool ahead of time, then map the work to those threads.
It would be easy enough to use an Executor (java), and Runnables. But this leads to an unwanted problem. If there are multiple events scheduled for a single entity, this naive arrangement might map two events for the same target to different threads. Consequently, you would have to put in a full set of concurrency controls, guarding all data structures that were shared, including the entity itself. This would counteract the effect of running on two threads, and make the arrangement worthless.
Instead, what I did was to create a separate event queue per entity, and make a custom event dispatcher that pulled only a single event off the queue. The Runnable and Executor in this better set up are managing entities (ones with one or more events on their private queues). Any entity can be chosen to run, but each entity will only work on a single event at a time. This makes more efficient use of the threads in the pool (no blocking between them), and obviates the need for concurrency control (other than in the queues). Event handlers can now be written as if they are single threaded. This makes game logic development easier for more junior programmers, and for those less familiar with server code or with Java (e.g. client programmers).
Obviously, this applies to data that is owned by a single entity. Anything shared between entities still needs to be guarded. In general, you want to avoid that kind of sharing between entities, especially if you have a large scale game where entities might be mapped across multiple machines. You can't share data between two entities if they are in different processes on different machines (well there is one way, but it requires fancy hardware or device drivers, but that is another story). So you will be building for the distributed case anyway. Why not do it the same way when the entities are local?
I found a number of posts about executors and event dispatching that touched on these ideas, but there was nothing official, and there seemed to be a lot of debate about good ways to do it. I'm here to say it worked great. I'd love to post the code for this. Again, maybe when Quantum is available, you'll see some of this.