Sunday, December 20, 2009

(Real) Science IS political, sorry to disillusion you

There is the big "climate gate" hoo-haw in the media right now. Reporters are acting surprised that some leading scientists were caught manipulating scientific literature to silence skeptics and dissenters. They convinced peers to knock skeptics' articles, journals to reject skeptics' papers, remove peers from paper review committees if they passed skeptics' papers, and even shut down journals that published dissenting views. Maybe even fudged their data.

Hey! Science is supposed to be awesome. It always eventually gets it right. Real science is about reproducible experiments and validated results. Actually, no, it is political. Like most other human endeavors.

  • Galileo and other historical scientists were shut down by their community. Granted, their scientific peers were not basing their views on empirical data. But many modern scientists still base their views on what they've been taught, not what they've measured. This is understandable, you have to stand on the shoulders of giants to have time to make advances.
  • Views are often validated by currently understood standards. If you plot the published speed of light against the year of publication, you will see sequences of flat spots where a value is almost identical to what was previously "measured". And then there is a jump; followed by another flat period of many years. Is this because they were using the same equipment and experimental procedure? Or because anyone that tried to publish a different "answer" was considered a skeptic and shut down? Again, this is understandable. Humans tend to try to be consistent, and not be antisocial and go against the crowd. It requires extra diligence to disprove a well respected master in their field. Like the great Einstein when quantum physics popped up. (Wait! Maybe God changes the real speed of light periodically as a joke!)
  • Scientists are funded. Based on whether they get published or referenced. Or if they agree with the "sponsoring" corporation. So the system manipulates them into agreeing with the crowd. But the underbelly of the scientific community is less pretty than what we might have thought of as this kind of indirect pressure.
  • Journals are funded. If the larger community of scientists don't subscribe to that journal offers, or schools or corporations pull their funding (or advertising!), a venue of dissent/questioning dies.
  • I've seen public "attacks" during a presentation of research. In the form of a "question". Being honest, these questions are self-aggrandizing. E.g. "what makes you think that you are right when I have already published the opposite". It can embarrass a young scientist and discourage them from disagreeing in the future. Only the thick-skinned "crazies" keep at it. Like Tesla.
  • I've been on paper review committees where papers are summarily discarded. There are so many that only one or two of the reviewers are assigned to read a given paper before the meeting. If they didn't understand it, or disagreed with the findings (based on their own experience/bias), it can get tossed very quickly. There are a lot to get through. Even when it is "marginal", the shepherding process can be taxing, discouraging a reviewer from volunteering. After all, they are contributing their expertise, but don't get their name on the paper. (Suggestion: maybe they should. If they pass something that proves incorrect, they lose points. As an incentive to get it right. Or would that discourage participation?). For "workshops" (not full Journals) an author's name is sometimes on the paper being reviewed, so their reputation is considered.
  • As science get more "fine", some experiments cannot be reproduced except on the original equipment or by the original experts. Think CERN and supercolliders. (Or cold fusion?) Who has another billion dollars just to *reproduce* a result? Unless the larger community thinks a result is hogwash and feels motivated to pool their resources and dump on the results. So who is going to disagree? It almost sounds like a religion at that point.

I bet you've done the same things. Maybe at work. Tried to shut down "the competition". Competing for attention, or a raise, or recognition... You know what would be better? Listen to those that disagree and make sure they know you have heard them. They are trying to make the best decisions they can given their background. No one tries to make dumb decisions. If they are wrong, I'm sure they would appreciate learning something they don't know. Or, maybe they have something to teach you.

Imagine! Learning something from someone that disagrees with you and you find irritating. A dissenter. A skeptic. Seems like those that shut down dissent are not just closed minded, but unwilling to learn. Such a scientist should be embarrassed for themselves. Isn't IDEAL science supposed to be about discovery? Too bad that in reality there is so little ideal science, and so much science influenced by the politics of "real-life science".

Tuesday, December 8, 2009

Data Driven Entities and Rapid Iteration

It is clearly more difficult to develop content for an online game than a single player game. (For one, sometime entities you want to interact with aren't in your process). So starting with the right techniques and philosophies is critical. Then you need to add tools, a little magic and shake.

There are several hard problems you hit when developing and debugging an online game:
  • Getting the game to fail at all. A lot of times bugs are timing related. Of course, once you ship it, to players it will seem like it happens all the time.
  • Getting the same failure to happen twice is really hard. E.g. if the problem is caused by multiplayer interaction, how are you going to get all players or testers to redo exactly the same thing? And in the spirit of Hiesenbugs, if you attach a debugger or add logging, good luck getting it to fail under those new conditions
  • Testing a fix is really hard, because you want to get the big distributed system back into the original state and test your fix. Did you happen to snapshot that state?
  • Starting up a game can take a long time (content loading). Starting an online game takes even longer because it also includes deployment to multiple machines, remote restarting, loading some entities from a DB, logging in, ...
  • If you are a novice content developer plagued by such a bug or a guy in QA trying to create repro steps to go along with the bug report, it will probably end badly.
Consequently, what do you need to do to make things palatable?
  • Don't recompile your application after a "change". Doing that leads (on multiple machines) to shutdown, deploy, restart, "rewind" to the failure point. You'd like to have edit and continue of some sort. To do that, almost certainly you'd need a scripted language (or at least one that does just in time compilation, and understands edit and continue).
  • Don't even restart your application. Even if you can avoid recompilation, it can take a loooong time to load up all the game assets. Especially early in production, your pipeline may not support optimized assets (e.g. packed files). For a persistent world, there can be an awful lot of entities stored in the database to load and recreate. Especially if you are working against a live or a snapshot of a live shard. At the very least, only load the assets you need.
  • Yes, I'm talking about rapid iteration and hot loading. When you can limit most changes to data, there is no reason you can't leave everything running, and just load the assets that changed. In some cases when things change enough you might have to "reset" by clearing all assets from memory and reloading, but at least you didn't have to restart the server
  • Rapid iteration is particularly fun on consoles, which often have painfully long deployment steps. Bear in mind that you don't have an editor in-game on a console, it is too clumsy. So the content you see in your editor on your workstation is just a "preview". You would swivel-chair to the console to see what it really looks like on-target.
  • Try to make "everything" data driven. For example, you can specify the properties and behaviors of your entities in a tool and use a "bag of properties" kind of Entity in most cases. After things have settled down, you can optimize the most used ones, but during content development, there is a huge win to making things data-driven. Of course, there is nothing stopping you from doing both at once.
  • Another benefit of a data-driven definition of an Entity is that it is so much easier to extract information needed to automatically create a database schema. Wouldn't you rather build a tool to do this than to teach your content developers how to write SQL?
Don't forget that most of the time and money in game development pours through the content editor GUIs. The more efficient you can make that, the more great content will appear. If you want to hire the best content developers, make your content development environment better than anyone else's.

Thursday, November 19, 2009

That's not an MMO, its a...

Some people call almost anything "big" an MMO. Is facebook an MMO? Is a chess ladder an MMO? Is Pogo? Not to me.

What about iMob and all the art-swapped stuff by The Godfather on the iPhone? You have an account with one character. Your level/score persists, money, and which buttons you've press on the "quest" screen. As much as they want to call that an MMO, it is something else. Is it an RPG? Well, there is a character. But you don't even get to see him. Or are you a her?

These super-light-weight online games are not technically challenging. You can build one out of web-tech or some other transaction server. If you are all into optimizing a problem that scalable web services solved years ago, cool. Your company probably even makes money. But it doesn't push the envelope. Someone is going to eat your lunch.

Maybe I should have called this blog "Interesting Online Game Technologies".

Me? I want to build systems that let studios build the "hard" MMO's, like large seamless worlds. I don't want a landscape that only has WoW. If that tech were already available, we'd be seeing much lower budgets, more experimentation, games that are more fun, lower subscription fees, more diversity, and better content. All at the same time. I certainly don't want to build the underlying infrastructure over and over.

Of course, I'd love it if the tech solved all the hard problems so my team could build something truly advanced while still landing the funding. Unfortunately, today, people have to compromise. But maybe not for long.

Reblog this post [with Zemanta]

Friday, August 21, 2009

What is software architecture (in 15 pages or less)?

Kruchten's "4+1 views of software archicture" is one of my favorite papers of all time. It shows four views of software architecture (logical/functional, process, implementation/development, and physical). The plus one is use-cases/scenarios.

Being 15 pages long, it is an incredibly efficient use of your time if you want to disentangle the different aspects of designing complex systems. And it gives you terminology for explaining what view of the system you mean, and which you have temporarily abstracted away.

Don't get hung up on the UML terminology:
  • Logical is what is commonly refered to as a bubble diagram. What components are in your system and what are they responsible for?
  • Process is which OS processes/threads will be running and how they communicate.
  • Implementation is the files and libraries you used to build the system.
  • Physical is your hardware and where you decided to map your processes.
http://en.wikipedia.org/wiki/4%2B1_Architectural_View_Model
I prefer the original paper: http://www.cs.ubc.ca/~gregor/teaching/papers/4+1view-architecture.pdf

Monday, August 10, 2009

Fail over is actually kind of hard (and expensive)

You are building a peer to peer or peered server small scale online game. There are players that purposely disconnect as soon as someone start beating them. They think their meta score will be better, or whatever. Or maybe they have a crappy internet connection. In any case, the master simulator/session host goes down abruptly; now what?

At least you would want to keep playing with the guys you've matched with; the rest of your clan or whatever. At best, you'd like to keep playing from exactly the point of the "failure" without noticing any hiccup (good luck). The steps to get session host migration/fail over to a new simulator:
  • Restablish network interconnection
  • Pick a new master simulator
  • Coordinate with the Live Service about who is the master (or do this first)
  • Have the entity data already there, or somehow gather it
  • Convert replicated entities in to real owned/master entities
  • Reestablish interest subscriptions, or whatever distributed configuration settings are needed
  • "Unpause"
Some challenges:
  • To reconnect, someone needs the complete list of IP/ports used by the players. But that is consider a security issue. E.g. someone could use that info to DOS attack an opponent. Let's assume the Live Service renegotiates and handshakes your way back into business.
  • If you aren't connected, how do you elect a new master? If you don't have a master yet, how would all the clients (in a strict client/server network topology) know who to connect to? So the answer has to be precomputed. E.g. designate a backup simulator before there is a fault (maybe the earliest joiner, lowest ip address...)
  • If your game session service supports this, it can solve both of the previous issues by exposing IP addresses only to the master simulator, and since it has a fixed network address, each client can always make a connection to it and be told who is the new master.
  • If the authoritative data is lost on the fault, you may as well restart the level, go back to the lobby or whatever. So instead, you have to send the entity state data to the backup simulator(s) as you go. This is actually more data than is necessary to exchange for an online game that is not fault tolerant, since you'd have to send hidden and possibly temporary data for master Entities. Otherwise you couldn't completely reconstruct the dead Entities. There may be data that only existed on the master, so gathering it from the remaining clients isn't going to be a great solution. Spreading the responsibility is that much more complicated.
  • So the backup master starts converting replicated Entities into authoritative Entities. Any Entities it didn't know about couldn't get recreated, so the backup master has to have a full set of Entities. Think about the bandwidth of that. You should really want this feature before just building it. Now we hit a hard problem. If the Entities being recreated had in-flight Behaviors (e.g. you were using coroutines to model behavior), they can't be reconstructed. It is prohibitively expensive to continuously replicate the Behavior execution context. So you wind up "resetting" the Entities, and hoping their OnRecreate behavior can get it running again. You may have a self-driven Behavior that reschedules itself periodically. Something has to restart that sequence. Another thing to worry about: did the backup simulator have a truly-consistent image of the entity states, or was anything missing or out of order? At best this is an approximation of the state on the original session host.
  • Unless you are broadcasting all state everywhere, you are going to have to redo interest management subscriptions to realize bandwidth limitation. This is like a whole bunch of late-joining clients coming in. They would get a new copy of each entity state. Big flurry of messages, especially if you do this naively.
  • Now you are ready to go. Notify the players, give them a count-down...FIRE!
What did we forget? What defines "dropped out"; a maximum unresponsiveness time? What if the "dead" simulator comes back right then? What if the network "partitioned"? Would you restart two replacement sessions? Do you deal with simultaneous dropouts (have you ever logged out when the server went down? I have.)?

Note that the problem gets a lot easier if all you support is clean handoff from a running master to the new master. Would that be good enough for your game.

So is it worth the complexity, the continuous extra bandwidth and load on the backup simulator? Just to get an approximate recreation? With enough work, and game design tweaking, you could probably get something acceptable. Maybe give everyone a flash-bang to mask any error.

Or maybe you just reset the level, or go back to the lobby to vote on the next map. And put the bad player on your ban list.

Me? I'd probably instead invest the time of my network and simulator guys in something else, like smoothness, fun gameplay, voice, performance. Or ship earlier.

Thursday, July 30, 2009

Incremental release vs. sequel

(Warning: plenty of irony below, as usual)

There was a time when the MMO developer community thought that the ideal was to stand up your world, and then start feeding the dragon. As quickly as possible, get new content into the players hands. The more new content, the more fascinated they would be, the stickier their subscriptions would be and the more money you would make.

So we put a lot of effort into techniques to manage continuous development of content, test it, and roll it out with the minimum possible maintenance window. Some got good enough they could release content or patches every week. Didn't we have automated client patchers? Why not use those to continuously deliver content. Not just streaming content as you move around in virtual space, but as you move forward in real time.

Then someone noticed that Walmart took their game box off the shelf because it had been there for a year, and new titles were showing up. Surely consumers want the new stuff more? Besides, you don't need the latest release, you get all the new stuff when you patch. Then new subscriptions drop because of that lack of visibility at retail. Why would Gamestop sell prepaid cards if it is so easy to pay online?

So the light goes bing, and it is suddenly obvious that sequels would be a much better approach, since you'll get shelf space if it is a new SKU. Clearly this is the best approach, since it works so well for Blizzard. (So clearly, I cannot choose the glass in front of you. Where was I?) All you have to do is patch a few bugs, and set up a parallel team to work on the sequel.

But then everyone piles onto Steam. Definitely the end of brick and mortar. Maybe we go back to the low-latency content pipeline so our game is fresher than the sequel-only guys. But wait, Steam sales and free trials increase traffic at Gamestop.

Clearly, I cannot choose the glass in front of me... Wait til I get started.

Not really. As you can see, the point is that technologists are very unlikely to see the future of sales and distribution mechanisms. And if we did, it would take a year to adapt our development practices and product design to take optimal advantage of it.

The answer? Be flexible. Don't assume you've got the one and only magic bullet. Requirements change. And for an MMO the development timespan is large enough that a lot of things will change before you are finished. Don't implement your company into a corner with a big complex optimal single point solution, and keep your mind open.

Monday, July 13, 2009

Web tech for "game services"

I hold the opinion that every disagreement is a matter of different axioms, values or definitions.

I believe definitions is what is going on with this post by "Kressilac" (Derek Licciardi?):
http://blogs.elysianonline.com/blogs/derek/archive/2009/05/29/6400.aspx I'd guess we do hold the same values.

Derek argues that portions of an MMO server are suited to using and best implemented using web technology. I absolutely agree. I call these parts of the system "Game Services". Most would be accessed directly from the client. Examples:
  • profanity filtering,
  • shard status, open, full, down, locked, capped
  • in game search/player online,
  • clan/guild management,
  • item trade,
  • auction,
  • voting/elections,
  • chat,
  • match making/lobby,
  • leaderboards,
  • persistent messages/email,
  • reputation management/community services,
  • in-game advertising
  • Search,
  • authentication,
  • CSR account locking
  • patching, streaming patching
  • microtransactions
  • petitions
  • custom content
  • character annotation, friend lists,
  • knowledge base
  • voice chat
  • Maybe: inventory, quests, crafting (touches in-game entities)
Anyone got more for this list?

Most of these systems are "decorative" and are for the community aspects of the game.

The complication arises where the data managed by these services is affected by or used by the simulator (I.e. in-game logic). E.g. the number of members of your clan changes Mana recharge rate. I'd suggest that most of those kind of communications are not critical to be transactional or latency critical or can have the game design bent to accommodate that restriction.

There are a couple of those game services (especially those interacting with items) that are entangled. The easiest way to deal with those is to transfer ownership of the Entities in question to one system or the other such that there is no synchronization needed other than at the transfer. I'm betting that is how WoW does auctions and mailing of items.

My "run screaming; it sucks" article is my thinking about the core gameplay/simulator manipulated Entities. What Derek calls Real Time Data. To me that is the "hard problem". All the rest of the stuff can be handled by web-tech, and that is a solved problem (waves hand dismissively), and not so interesting.

Well. There are a few interesting issues, like coordinating authentication. But the coolest payoff (as Derek states) is that these things automatically become available offline via browsers, mobile devices, etc.

BTW, I'm working on another contentious article that more fully details the issues that drive my opinion about DB-centric game state management.