How do you score wildly different games?
In my previous post in this series, I attempted to answer the question of what purpose review scores serve. To distill last week’s rambling diatribe, review scores serve three purposes:
- Communicate succinctly to the reader the reviewer’s overall opinion of the game.
- Rank and compare games against one another.
The second purpose creates the need I discussed at length last week: the need for differentiation. A good system has to have sufficient granularity to distinguish between games so that the “bins” of equal games do not become uselessly large.
But this brings up a natural question: how do you score games that are vastly different in genre, style, content, console, background, development, intent, or size? How do you compare Halo to Final Fantasy 13? How do you compare Super Mario Galaxy 2 to Super Mario 3D Land? How do you compare Angry Birds to Assassin’s Creed? How do you compare The Elder Scrolls V: Skyrim, Batman: Arkham City, The Legend of Zelda: Skyward Sword, and Bastion? If the intention of a scoring system is to rank and compare games, then these games (in theory) ought to be scored on the same scoring spectrum. But how in the world can you rate such different games against one another?
That final comparison I mentioned is intentional because that’s the comparison I’ve seen drawn most often in the recent Game of the Year awards doled out by every gaming outlet in the country (don’t worry, ours are coming soon). Nearly all of the rankings I’ve seen have chosen one of those four games as the Game of the Year, and comparing any of them presents incredible challenges. Do you mark down Skyward Sword for being so similar to its predecessors because it means it was responsible for less actual ‘new’ gameplay? Or do you give it full credit solely based on how much fun the game itself is? Do you give Skyrim extra points because its playtime is so much longer, or do you dock it because the experience is largely what the player makes of it? On the flip side, do you dock Batman: Arkham City for being relatively short by modern standards, or do you actually give it more credit because it managed to be so entertaining in such a short amount of time? And Bastion presents perhaps the biggest question of all: do you give it extra points because it was a low-budget, low-price indie game, or do you compare it toe-to-toe with the big boys?
Those questions are all made somewhat explicit in Game of the Year rankings, but they are also implicit in any numerical ranking system. When you give a game like Bastion a 9 out of 10, you’re subscribing to some school of thought that dictates how the game ought to be compared to others. I would posit that if you were put into a cultural vacuum and allowed to play those four games, Bastion would not even be in the same league as the other three; but because you have the context of its development and its price, and because you have the context of forty years of video game history, you view the game in a different light. That’s not meant to invalidate praise of Bastion; it’s only meant to highlight the fact that, whether or not you realize it and whether or not you intentionally do so, you’re subscribing to a school of thought whenever you give a game a rating.
There are dozens of different dimensions this question can take on. Take, for example, innovation. Recall a couple years ago the release of the football off-shoot Backbreaker. The idea of Backbreaker was a football simulation grounded completely in a real physics engine: no pre-recorded tackles, no probabilities of catching vs. whiffing, no offensive line rankings to determine pass protection; everything was grounded directly in the physics of the playing field. On the other hand, Madden NFL 11 released that year, and while it had some innovations, it was largely just another instantiation of the same game we’d seen incrementally developed over the previous decade.
How do you compare the two? If you were to make the comparison strictly on how fun each game is, Madden NFL 11 wins in a landslide. I played both, and while I loved Backbreaker’s new approach and think it has an incredible amount of potential, the game as a whole was too glitchy and plodding to really compete with the raw fun of Madden NFL 11. But at the same time, people point every year to how the Madden NFL games undergo only minor incremental improvements: for example, Madden NFL 11 introduced little beyond a new play-calling system and a new online game mode. Do you dock Madden NFL 11 points for the fact that it has so little new content compared to Backbreaker? There’s also the issue of NFL licensing. Having officially licensed NFL franchises certainly adds to the game experience, but do you dock Backbreaker for lacking that even though it was a business constraint rather than a design decision? What about Backbreaker’s physics engine? It was glitchier than Madden NFL’s system, but Madden NFL has had a decade to refine and perfect its system. Do you hold the glitches against Backbreaker as compared to Madden NFL when they were created under such different circumstances?
Comparing those two games is inordinately difficult when you consider the context of the games and their development, and yet as far as gameplay goes, you’d be hard-pressed to find any pair of games that are as similar as a pair of sports games about the same sport. If comparing those two games is that difficult, how can you ever expect to compare games like The Elder Scrolls V: Skyrim to Bastion?
The easy answer to this question is: you don’t. You describe and analyze and review the game on its own merits, but you do not attempt to give the game any kind of assessment that will rank it against games that are very different. Unfortunately, that’s neither likely nor a good idea. Scoring systems are not going anywhere, but more importantly, they don’t necessarily need to go anywhere. They have their merits. Ranking, differentiation, and succinct communication are good goals to have. The question is just how the system ought to work. Next week, we’ll examine some of the possibilities and ask what roles various qualities of the game should play in assigning it a numeric ranking. Among these, we’ll talk about play time, enjoyment, artistic merit, innovation, fan service, developer context, price, replay value, and any other factor I can think of over the next week. Or that you can think of: feel free to leave your ideas on how a game should be scored in the comments below.