If someone asks you how a game is, how do you answer? Most people are going to say “It’s good” or “I didn’t like it” or “It sucked out loud” and then explain their answer a bit. What I’m pretty sure they’re not going to do is say something like “I give it an 82 out of 100” right out of the gate. It wouldn’t make any sense to do that, because despite being a more specific metric it doesn’t really tell you what the person thought of the game. Numbers require context, yet in critiques they’re used as summary evaluations that are expected to suffice for entire reviews. I’m not a fan of that kind of labeling, I avoid doing it myself, and today we’re going to look at why.
I’ve pretty much made my point already, but to better illustrate the issues with points and ranking systems, we’ll go through the many varieties you’ll find out there in the wildlands of game journalism. Since I live most of my life on Steam, let’s start with their user rating system, which is actually just a simple question: Do you recommend this game? The answer, yes or no, is the only systemic rating you can give a game that the Steam platform itself recognizes, and it’s what gets used to calculate aggregate review scores and sort user reviews. It’s a binary system, thumbs up or thumbs down, Siskel & Ebert style. Many users have complained about its limited structure over the years, begging for more ratings or a neutral rating or other tweaks. But for whatever reason, in this one instance Valve has refused to mess with perfection.
That’s right, perfection. I maintain that a simple yes-or-no system is the ideal rating system for consumers. I’ve already touched a bit on why, but the key is that it’s the closest to an actual value judgment that you can get from criticism. It’s asking the reviewer to put themselves on the spot for the enjoyment of others, to recommend something that will entertain or discourage something that will not. The moment you introduce a more nuanced system than that the message gets muddled, and it gets harder and harder to tell if you’re looking at an actual recommendation or not.
The binary approach also puts more importance on the actual review, and discourages boiling the evaluation down to an arbitrary figure. I’m obviously biased on this front because I write things that I want people to read, but again I want you to return to the original question of how you tell others what you think about games. If someone poses the question “Do you recommend this game?” you can answer a simple yes or no, but odds are that you’ll contextualize your answer with “Yes, but…” or “No, because…” as a courtesy to the questioner. We all know that curt answers are less than helpful so we add details that give more information on what we liked and what we didn’t and why, to help our friend make an educated decision. And to me that’s the whole point of a review, to give the reader the information they need to decide on a purchase.
You can look at the need for context as a weakness of the binary rating system, but the truth is that no rating system is descriptive enough on its own. As an example of this we can look another level up from binary to the only other system I use, the five-point scale. Popularized by some movie critics and then brought to the public by Amazon and the App Store, five-point scales allow for a little more wiggle room than pure thumbs up or down. Games that are truly excellent or truly awful get their own spaces at the ends of the spectrum, and the indecisive get a convenient middle point to couch their opinions in. But already you should be able to see the weakness of a wider system: it leaves open the question “What constitutes a recommendation?”
The moment you leave a binary system, the system itself will require context to understand. Since review scores require the context of the review anyway, this adds a second layer of interpretation to the process. Yes, you can look at a five-point system and just filter down to the 5s, but then what will you miss by leaving out the 4s? If you include the 4s, what is it about them that holds them back from being 5s? Is 3 good or bad? You can make the evaluation more granular, but granularity has to be explained, which defeats the purpose of a more granular system in the first place. It’s far clearer to say “I recommend this game because…” than “I give this a 3 out of 5 because…” simply because the numbers add more steps between the review and the recommendation.
As I said, I use a five-point system here on the website despite my misgivings about it. I do this to provide an additional tool for sorting through hundreds of reviews, not to provide shorthand for which games I recommend. For me a higher rating indicates a higher likelihood that the game will suit the reader (absent other context), and I say as much where I explain the system. But again, it’s not a system I expect people to fully understand without an explanation. A large part of that is because rating systems can be absurdly subjective, even on a limited scale. I’ve heard of workplace employee evaluations done on a 5-point scale where ratings of 1 or 5 were discouraged, because 1 was beyond unacceptable and 5 was literally ideal. That left only three ratings, from 2 to 4, which broke down to “bad”, “acceptable”, and “great”. It shouldn’t come as a surprise to learn that most evaluations landed on 3.
This is the big problem with rating systems that get expanded out to 10- or 100-point scales, and one that should be obvious from a cursory glance at Metacritic. As the system grows in complexity it begins to move towards familiar ranges, and for video games this has consolidated around the 70-90 range for “good” games. Most games that are well-received hover somewhere in this area, while anything that falls below gets asked some hard questions, and virtually nothing ever rises above the pack. Outlets like PCGamer have had a major hand in propagating this system, as can be seen in their own breakdown of scores. Anything that’s good or great ends up in the 70s or 80s, conditional recommendations are 60s, and then the rest are flavors of crap.
What’s particularly insidious about 100-point systems is how they further subdivide within each range. A good game might range from 70 to 79, but what exactly is the difference between a 72 and a 73? That difference should be elucidated in the review itself, but breaking down the rating to such a level distracts from what should be a clear evaluation. It also compounds the problem of review scores clustering around certain ranges, because the reviewer has more wiggle room within each range. Even a 10-point scale requires some commitment to individual scores, but having essentially a 10-point scale within a 10-point scale only makes it easier to fudge uncomfortable reviews.
The point, if that hasn’t become abundantly clear through constant mention, is that there’s no adequate substitute for reading a quality review. My objection to elaborate scoring systems is that they distract from doing just that, while not actually clarifying the reviewer’s opinion in any way. Adding more ranges and numbers doesn’t reflect the evaluation any better than a simple yes-or-no, but it gives the illusion of specificity. And I haven’t even bothered to touch on reviews that break scores into categories like graphics and fun factor and then aggregate a total, because that increases the problems outlined here by orders of magnitude. In the end, all I’m asking is that you not be taken in by big numbers on Metacritic or the Steam store pages. A number isn’t going to tell you what you need to know about a game, because all you need is an honest answer to the simplest of questions: Do you recommend this game?