37, 1004, 3102, 90000… seemingly random and innocuous numbers. However, if you play video games, these numbers may be familiar to you. Let me explain – error 37 was what greeted many fans of Diablo 3 during its launch. Error 1004 was an issue experienced by some SimCity players when it launched this year, but the number was picked to help illustrate my point – the fact of the matter is that SimCity gave good old text error messages when it launched, or rather, attempted to launch. The last two… well they’re the reason I’m writing this article, and they both belong to the most recent game to experience launch issues – Final Fantasy XIV.
Wait a second, you say – Final Fantasy XIV? But its release date is listed as August 27th, which isn’t even here yet! No, not really – a week ago the game went through its last beta test, an open beta designed to stress the servers, which also carried the perk of not including a server wipe. Yes, it was essentially open beta combined with early access. Issues were discovered, because that’s what a beta test is for, and then this weekend arrived, scheduled as the actual early access period; full launch is occurring on the aforementioned August 27th. Sounds like a wonderful plan, doesn’t it? Get a bunch of people in to load up the servers, make sure everything is okay, then unleash the game upon the world. Except it didn’t go so well, and many players have been unable to get into their accounts, experiencing the errors above or other general server load issues. (I’ve got to feel like there’s a tiny bit of irony there considering Final Fantasy XIV is actually in the process of re-launching after its initial launch presented a game that was met with low review scores and disappointed players.)
However, I’m not writing this to pick on FFXIV. In fact, like other games before it, I know it will weather the launch issues and eventually be judged based on the game it is, and I wish everyone at Square Enix the best with the game – so far I’ve enjoyed what I’ve been able to play of it, if that is worth anything. What I’ve really got to wonder is why companies still struggle to get their products launched smoothly. I question that because the number of players expecting to get into these games is a known quantity – in the age of digital sales and distribution, a company can look at sales records and instantly know what kind of population they’re going to have to support. Even adding a physical distribution method as FFXIV has done, there is still the fact that you must create an account and register a key to that account to get access to the game – another easy tracking point. If you know almost exactly the maximum number of people who will be knocking on the door come launch day, why haven’t you done everything you can to prepare?
I think there are two major factors that need to be considered in answering that question. First, the optimization of the software itself and an understanding of its load characteristics, so you know how many servers you’re going to need to support your incoming player base. Second, once you’ve established those load numbers, there needs to be a balance struck between under-provisioning your service and causing load issues, and over-provisioning and causing a financial burden upon yourself with no tangible benefit. As a point of clarification here, when I say “server” I’m talking about a single physical or virtualized instance of hardware, whereas often in the online game space “server” applies to what also can be referred to as a “shard” – an instance of the game world, identical but separated from other instances of the game world, generally for the purpose of load management. To confuse the issue, a shard may actually be made up of multiple servers.
I feel like load testing, understanding load characteristics, and general server-side optimization should be well-known quantities at this point. In the early to mid 2000s I was on a team launching and operating the highest-traffic Canadian-based websites around, and we regularly ran into load problems that had to be dealt with. To be clear, some of these sites were games in and of themselves, and while an online game is still more complex than those, the techniques that went into our load profiling would be similar for an online game. Unfortunately, these activities often fall under the purview of quality assurance, and many in the industry will tell you these poor souls are asked to move mountains with too few people and not enough money in too little time. That may not be the case in any of the examples above, but I wonder if it had an impact.
Let’s say it didn’t, and perhaps a company has a precise understanding of exactly how many people they can squeeze into a server before things start to go sideways. With that out of the way, at this point the discussion is “How many servers do we launch with?” Overshoot and you’ve spent time, effort, and money provisioning that hardware for no reason. Undershoot, and, well… error 37. This is where I think a couple of adjustments to launch day thinking could make all of the difference for future titles.
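To put some rough shape on that overshoot/undershoot trade-off, here is a back-of-the-envelope sketch in Python. Every number in it is made up for illustration (the peak player count, the per-server capacity, the headroom percentage), and the function names are my own; no studio's real figures or tooling are implied. It only shows the arithmetic: how many servers the launch peak calls for, and how many of those sit idle once the initial rush fades.

```python
def servers_needed(peak_players, players_per_server, headroom_pct=25):
    """Servers required to hold peak_players with some spare capacity.

    Integer math only, rounding up, so a single extra player past a
    server's limit still costs a whole extra server.
    """
    if players_per_server <= 0:
        raise ValueError("players_per_server must be positive")
    padded = peak_players * (100 + headroom_pct)   # players plus headroom, scaled by 100
    per_server = players_per_server * 100          # capacity on the same scale
    return -(-padded // per_server)                # ceiling division


def idle_after_launch(launch_servers, steady_players, players_per_server, headroom_pct=25):
    """Servers left idle once the population settles to steady_players."""
    still_needed = servers_needed(steady_players, players_per_server, headroom_pct)
    return max(launch_servers - still_needed, 0)


# Hypothetical numbers: 300,000 concurrent players at the launch peak,
# 5,000 players per server, settling to 100,000 players a few months later.
launch = servers_needed(300_000, 5_000)
idle = idle_after_launch(launch, 100_000, 5_000)
print(f"launch servers: {launch}, idle after the rush: {idle}")
```

With these invented inputs, two thirds of the launch hardware ends up idle, which is exactly the cost that tempts a company to undershoot in the first place.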
Before discussing the financial reasons, it is important to point out and deal with one of the major gameplay reasons not to launch with too many shards (and therefore servers). Any new game is going to have an influx of people who want to check it out and see if they enjoy it, and obviously the concentration of these players is going to be highest at launch. At some point after launch these people will move on, which can leave a shard with a low population. In an online game, this can be incredibly frustrating for those still playing on that shard, perhaps causing some of them to leave, creating a downward spiral effect. In my mind, the best answer here is simply to do away with shards altogether. They came about in a time when technology was not at a point that would support everyone on a single shard, and they were the simple answer. I no longer see a good technological reason for them, so let’s stop using them.
Having eliminated shards, the next step is having the appropriate amount of hardware on hand for launch. There is clearly a large financial incentive to not overspend on hardware that will sit idle shortly after launch. However, having worked with hardware vendors for years, I suspect it wouldn’t be hard to persuade a company to come to some kind of arrangement during the launch window for something as simple as publicity. Can you imagine millions of players experiencing a smooth Diablo 3 launch with a graphic during the loading screen that said “This smooth launch brought to you by Dell”? There’s actually some precedent for this – it’s well known that CCP uses IBM hardware in the environment upon which they run EVE Online (and presumably Dust 514 now as well). Maybe I’m completely wrong on this point, but I believe there’s potential on both sides of that agreement for some substantial benefit. Alternatively, there’s room to discuss offloading processing to scalable cloud services like AWS rather than hosting all of your own processing, although that’s something you have to be planning for pretty early on and is a whole different discussion.
Am I talking crazy talk over here? Are these completely unreasonable thoughts? I have no idea, because I don’t work in the gaming industry. I do have an awful lot of experience in making stuff work online though, and I think these are some steps in the right direction. If somebody has a launch in the future and wants to test me on it, I’m open to offers! ;)