Can you predict review scores?

Brighton-based company claims to.

Brighton-based usability company Vertical Slice claims to be able to predict videogame review scores.

Speaking exclusively to Eurogamer, director Graham McAllister said his techniques can "inform [companies] to make a better game", pushing them closer to the "magic line" of 80 per cent.

"We just finished some research where we can start to predict a year in advance what the game is likely to get," McAllister told us. "And obviously we get more accurate as time goes on.

"This is brand new research: we're aiming to publish it this year. At the minute we're saying we can get it into these bands: low, medium or high. We're not saying we can predict an eight or a nine, that level of granularity - that's subjective behaviour and all sorts of things. That's nearly impossible, to be honest."

"People think you can't predict a game based on quantifiable data," he added. "What we can do is get these estimators. Some people will just have a hard job believing it. We have analysed the statistics to death, thorough and rigorous, and what we're saying is, 'You may not like it, but this is the best model that anyone has come up with to date.'"

Built on - and run out of - the University of Sussex, Brighton, Vertical Slice uses the brains of physicians and psychologists to compile a series of accurate, scientific tests. It's this approach McAllister feels has been "lacking in the industry to date".

The tests come from reverse-engineering 154 Edge magazine reviews (more on that later) and from something called "behavioural or sequential analysis" - otherwise known as what you say and do while playing videogames.

The latter is based on the work of marriage counsellor John Gottman, who predicted within five minutes whether the couple before him would stay together for the next five years. He was 97 per cent accurate, said McAllister, who has a PhD himself.

Thus, he said: "After 30 seconds, we can predict if the game is going to be bad or good, to a certain extent." Specifically, this means sorting them into bands: low (1-4), medium (5-7), and good (8-10).

"What's important about that first minute," he added, "is that it's the time people play a demo for. That's super critical."

Reverse engineering uses patterns of words or phrases from reviews and matches them with scores. "All the high-scoring games talk about certain aspects; all the medium-scoring games talk about certain things; and all the low-scoring games talk about certain things. And there's a very clear mapping between them."

Who Vertical Slice picks to test the games is handled just as carefully, and the net is cast much wider than usual.

"Four or five years ago people made games for gamers: geeks making games for other geeks, basically. Now you have people making games for everybody. The problem is that the games companies don't understand everybody," said McAllister. "We're very, very thorough about the people we choose to test the game."

Vertical Slice interviews those people militantly, and is investigating psychological profiling techniques to uncover bias tendencies. "No one else is doing these sorts of things. It's really detailed information on who you are testing with."

And you can't lie. "Biometrics is our big thing; we hook people up to equipment that will measure your heart-rate or skin response. If someone says, 'This is the scariest game ever,' we'll be able to say, 'Really? Well, we don't think so.' And we'll be able to prove it," said McAllister.

By adding all of those components together, McAllister said Vertical Slice can very accurately predict the outcome of a game.

"We look at what [the testers] say, we map it onto what they were doing at that time on the video, how their video was responding on the biometrics, and we say, 'Look, all these indicators are saying low engagement, this is not looking good, so what's the cause of that?'" he offered.

With that information, McAllister can suggest improvements. "We're not game designers," he said, "but we can influence the game design."

"80 per cent is just so important these days. This is the magic line, can we get our clients towards that or over it?"

Vertical Slice likes to work with companies as early as possible to make sure "they're on track to deliver a bloody good game". Most market research is done on Alpha builds, which is "too late", McAllister reckons. "We're trying to help them much, much further back."

"There's no reason why you would not want it," reasoned McAllister, "the return on investment is potentially huge. At the minute, our clients range from PS3 developers to iPhone developers."

He's not trying to put critics out of a job, either. The usability testing is designed to "complement" reviewers or "expert game players" as he flatteringly dubs them. Even functionality testers like Babel have their place. "All these people should be involved," he said.

"What we hope is that you're reviewing better games."

Pure developer Black Rock has worked with Vertical Slice before. Pure and Split/Second game designer Jason Avent reckons usability testing added 5-10 per cent to Pure scores, and is keen to work with Graham in the future.

"We'll be working with Graham in specific cases where we need independent views and testing or where we can make use of their expertise in rigorous and detailed scientific analysis," Avent told Eurogamer.

In closing, McAllister said he and Vertical Slice have nothing to hide; this is not sorcery but science.

"We're putting it out there and saying, 'Here's our model in detail; broken down, scientific methodology, data analysis, results.' Anybody can look at it, it's completely open, they can go and repeat our process and probably get the same data," he said. "It'll be interesting to see how everyone reacts, you know?"

Find out more over on the Vertical Slice website.

Comments (59) Latest comment 2 years ago

  • stevetuck #1 2 years ago

    Article reads like a 10 but then ends up being a 7 :(
  • gizmo #2 2 years ago

  • Vroom #3 2 years ago

  • tomdominer #4 2 years ago

    Wow! A company can predict if a game is going to be shit JUST BY TRYING IT OUT!
  • BlueDot #5 2 years ago

    "who predicted within five minutes whether the couple before him would stay together for the next five years. He was 97 per cent accurate"

    Hmmm... what about a self-fulfilling prophecy? He "knows" the end result, so he will either work harder to keep the couple together or not bother trying too hard.
  • kendoji #6 2 years ago

    Interesting, but I really wonder how much more accurate this is than just getting a few gamers in and asking them if they like it or not.
  • sarcasmoidosis #7 2 years ago

    They can say if a game's shit, average or good by trying it for a while? Wow, that's quite a breakthrough :)
  • hiddenranbir #8 2 years ago

    At the minute we're saying we can get it into these bands: low, medium or high.

    Wow... I hope they did this all in their spare time. Otherwise we've seen a big waste of money.
  • lunnyt00n #9 2 years ago

    I, like everyone else on this site, have been doing this for years.

    Go and spend some of that research money on a PS3 and start it Folding@Home. At least that research will be useful and money well spent.

    What a crock.
  • TOOTR #10 2 years ago

    I've just been reading about John Gottman's marriage prediction analysis in Malcolm Gladwell's book 'Blink'. It is fascinating stuff.

    Educate yourselves unbelievers! ;)

    If more usability testing and profiling is spent during a games development to improve games quality - please tell me how this is a BAD thing?

  • mizcicz #11 2 years ago

    I can predict that these clever guys sorted out a fine way to rip off the management guys in the gaming industry... well, they probably deserve it...
  • MMMarmite #12 2 years ago

    Thus, he said: "After 30 seconds, we can predict if the game is going to be bad or good, to a certain extent." Specifically, this means sorting them into bands: low (1-4), medium (5-7), and good (8-10).

    "What's important about that first minute," he added, "is that it's the time people play a demo for. That's super critical."


    So there's 30 seconds in a minute, that's where I've been going wrong all this time /o\
  • Cpt_McOneball #13 2 years ago

    Hang on, I can do that. If something looks shit. IT'S GOING TO BE SHIT. Another popular method of telling how a game is going to be awful: Is it a movie tie-in?
  • Stompy #14 2 years ago

    "If more usability testing and profiling is spent during a games development to improve games quality - please tell me how this is a BAD thing?"

    We already have quality control for games.
    We already have dissenting voices saying, "wait a second, this sequel is barely an improvement at all. Are we just trying to milk the fans?"

    But we also have development budgets that can't be wasted needlessly, shareholders, and the need for profits.
    We also have buyers who will buy a bunch of chod just because it's branded with a rich chav's face. (Case in point: people actually listen to Chris Moyles).



    End result: no change, just earlier moaning.
  • CaoSlayer #15 2 years ago

    They could improve the accuracy by adding how much money is paid for ads in the media.
  • Rack #16 2 years ago

    These are Edge reviews, it's like they're trying to use statistical modelling to predict the roll of a dice.
  • giant_frying_pan #17 2 years ago

  • Anthony_UK #18 2 years ago

    Lost interest in that half way down...

    Really can't believe developers pay to have these people tell them whether their game is going to be crap, good or amazing! Surely they could have someone in house tell them the exact same thing? An honest opinion from the Q&A team? The producer? Or even inviting some people from their official forums etc??

    Seriously, if any of these devs employed me to do the same job, I'd bet everything I own I would come up with the exact same results as these people!

    Black Rock, Infinity Ward, Bungie... Hell, email/PM half a dozen people from this forum and I'm sure we'd be willing to offer the exact same service for free!
    Edited by 1 at 04/09/09 @ 19:40
  • JahB #19 2 years ago

    An honest opinion from the Q&A team? The producer?

    all of these are too close for proper judgement; if you spend 40-80 hours a week with one game only, it's impossible to have an impartial opinion on it, so an "outside" set of eyes can see things the people on the inside might look over.

    however, usually outside playtesting fills that gap. throwing significant money into an agency like this seems a bit like overkill to me
  • grantc7 #20 2 years ago

    Woah, I would really like to see this on Dragons' Den...
  • Spekingur #21 2 years ago

    QA teams are normally in-house. A third party that won't be afraid of telling a company that the interface is cumbersome or that the game won't be all that good unless they change this and that - a company like that is a good thing.
  • Anthony_UK #22 2 years ago

    Actually, asking the Q&A team probably wouldn't be the best idea, fair point to the posts above.

    I just can't believe they need to pay another company to tell them something that should be fairly obvious to anyone remotely interested in videogames!

    Or maybe I'm just jealous that I didn't come up with the idea myself!
  • Oh-Bollox #23 2 years ago

    If more usability testing and profiling is spent during a games development to improve games quality - please tell me how this is a BAD thing?

    It won't be used to improve game quality. It will be used to further rig the review scores they achieve.
  • LewisResolution #24 2 years ago

    So these hard scientific methods that they're throwing out there for everyone to see: where are they? They're not in this article, and not on their website.

    I'm genuinely interested actually. Might make a phonecall next week.
  • Velios #25 2 years ago

    So basically, a weird company preying on the insecurities of game developers and advising them to adopt a cookie-cutter approach to making video games in order to achieve better scores.
  • TOOTR #26 2 years ago

    @Velios - not really. From what I can gather they are not advising the game makers per se on how to 'make' a game - they are testing people's reactions, emotions, conversations while playing the game.

    And by using something called 'slice analysis' they can give the developers a good idea whether they are knocking it out of the park or are frustrating, boring, exciting or otherwise emotionally engaging the target audience. Using methods that simple Q&A's don't uncover.

    It's not 'review' rigging at all - but they are obviously selling it as a 'review predictor' because some companies are, rightly or wrongly, starting to KPI the teams with bonuses etc on metacritic scores.
  • Weezer #27 2 years ago

  • layleeloo #28 2 years ago

    That article title made me laugh. Most people on this site think they can score a game before its out.

    And what about PLAY.com. All the idiots who write reviews for films, games and CD's giving them 5 stars before they have even come out saying "did is gunna b wikd".

    Retards
  • Jonny5Alive7 #29 2 years ago

    They don't seem to do anything much out of the ordinary. It's just getting independent people to do a bit of testing, which virtually any member of the public with a bit of video game experience could do really. I think a lot of us can tell what score a game is going to be getting. For instance GTA IV was always going to be getting 10/10, GTA games are always very good and even if this one wasn't quite as good the hype would keep the score high.
  • Batfink #30 2 years ago

    If any publisher started using such a service to vet their upcoming releases, the smart developers should game the system to score a 'high' mark, regardless of the actual quality of the game.
  • gaselite #31 2 years ago

    First the DoA tits story and now this. Jesus.

    Videogames: Sometimes meritorious but too often formulaic and homogenous, and occasionally present backward, adolescent perspectives of the world. Things should improve when the medium emerges from its own adolescence. 6/10
  • Xerx3s #32 2 years ago

    Hurray for review scores based on hype expectancy! \0/
  • optimusprym8 #33 2 years ago

    Surely the main data they need is how much money is paid to a magazine for a score or a retail chain for an agreed chart position or who in the PR team wrote the review to send to the smaller non-gaming press
  • hyperkineticninja #34 2 years ago

    EA have an in-house game evaluation team that do speak up when a game is bad. Obviously it's mainly the studios that are putting effort into a game and want it to be good that use this service, which is generally why you see such differing games as Dead Space and G.I. Joe.

    ***Edit: That is why FIFA does improve each year, as the game eval team get involved very early on in the process.
    Edited by 2 at 05/09/09 @ 10:31
  • barnettbeans #35 2 years ago

    Can you predict if it's better than Halo or not?
  • Bremenacht #36 2 years ago

    The science of opinion.
    Why not save all the cash and just read what people have to say on a forum?
  • Freek #37 2 years ago

    Yes, you could hire that company, or you could just ask "What do you think?" whenever the game is previewed and people get some hands-on time with it.
  • byakuya83 #38 2 years ago

    They can better predict the review score when the game is further along in development, haha. Is this even necessary? I don't need any special software to predict a large majority of games reviewed on this very site will score the uncontroversial, middle of the road score, 7/10.
  • TheLittlestHobo #39 2 years ago

    I watched the video for Raven Squad and after 30 secs I could tell that it would be a turkey; the voice acting alone was a big red flag. Maybe I could start a new career as a video game quality consultant. I'd be much cheaper and I'll offer 6 bands (low, low medium, medium, high medium, low high, high) indicated with easy to understand verbal metrics: "LOL!", "Shit!", "Alright", "Not Bad", "Pretty Good", "OMFG!". Going back to Raven Squad, I would give that a "LOL!". I will await Eurogamer's review and see how I fare, then will be happy to accept any offers.
  • Dr_Wadd #40 2 years ago

    I'm not entirely convinced about the claims that you can judge a game fairly within the first minute. Leaving aside the fact that a lot of games have opening cut-scenes that last longer than a minute, I can immediately think of a couple of examples where the first minute entirely failed to be representative of the game as a whole. Mass Effect I found to be a bit of a tedious chore at first and I have to admit I wasn't overly impressed, but as soon as I got free rein of the Normandy I thoroughly enjoyed the game. The first minute of Ninja Blade is pretty much just a QTE event, so if I were to rate that game on the first minute I'd conclude it was nothing but QTEs. How about games that have flawed control schemes that, at first, appear to cripple the game but can be adapted to? I'm sure I'm going to get flamed for this, but I actually enjoyed the Iron Man game on the 360, but only once I had gotten used to the obtuse control scheme. Once I had that nailed I found it a lot of fun, but the immediate impression was that it was totally uncontrollable.

    I'm curious about their biometric testing. From the description it sounds like they are using polygraph type kit, but that can be woefully inaccurate. It's not clear from the description whether they measure things like heart rate while the game is being played, or if it is used to pick up on untruths during the player's feedback. With the former approach I can see some merit, not so for the latter.

    I'm generally wary of their claims, but I would have to see a lot more detail in order to make a totally fair judgement. Might have to drop them a line, I need to find a new job and this would seem a top way of combining my programming skills, love of gaming and Psychology PhD.
  • jonbwfc #41 2 years ago

    "The tests come from reverse-engineering 154 Edge magazine reviews (more on that later)"

    I don't see any more on that. Maybe you should get someone to scientifically analyse your articles?
  • Dr_Wadd #42 2 years ago

    @ jonbwfc, it's there, but not with a great amount of detail.

    Reverse engineering uses patterns of words or phrases from reviews and matches them with scores. "All the high-scoring games talk about certain aspects; all the medium-scoring games talk about certain things; and all the low-scoring games talk about certain things. And there's a very clear mapping between them."
  • kentmonkey #43 2 years ago

    ShopTo seem to be able to predict the review score and the text! ;o)
  • RobotRocker #44 2 years ago

    7/10 for PS3 exclusives is on the mark considering you still have to average out for Lair

    /stokes the BBQ
  • Clive_Dunn #45 2 years ago

    Yes Dr McAllister thank you. Clearly the publisher has no clue if the title is great or a crock of shite before the reviewers get their hands on it.

    Your consultancy fee will be well justified.
  • bubu.3k #46 2 years ago

    Well, such a company, if well intended, might actually do good by pushing games to a better quality. A company just looking for money might do the same thing, but with a lesser chance of success tho.
  • davisorle #47 2 years ago

    So they are researchers who will be able, after a long time, to tell us if the title that will be published a year from now will be bad, OK or good? Wow, I should be paid for it then, since I do believe I can do that myself right now. They should hire me. I'm willing to be paid for that too, as a part-time job. Did those ppl run out of ideas on how to scam or something? lol
  • Discalceaterabbit #48 2 years ago

    I'm surprised they didn't show the equation,

    Review body + game * advertising revenue from publisher = review score
  • Gearskin #49 2 years ago

    Rule of Thumb - Eurogamer always goes one above or one below the norm. And on occasion makes up a number between one and ten. Like Threesix for example.
    Edited by 1 at 06/09/09 @ 01:31
  • comissars_handgun #50 2 years ago

    I guess this kind of thing is inevitable with games getting more mainstream. I hope it doesn't get into the same test screening insanity that ruins a lot of Hollywood movies though. "The best reviewed games recently have had computer hacking minigames. Add this to your golf game and it's sure to get a 9"
  • IronCladChicken #51 2 years ago

    If more usability testing and profiling is spent during a games development to improve games quality - please tell me how this is a BAD thing?

    I'd guess for the same reason test screening is bad for movies.

    Edit: oops - sorry commissar's_handgun, I didn't see your post there!
    Edited by 2 at 06/09/09 @ 14:56
  • Nephirion #52 2 years ago

    I predict World of Warcraft Cataclysm will get a 10, Starcraft 2 will also get 10 and Diablo 3 will also be awarded 10, nothing to do with the brown envelope full of notes that was given to the press by some guy in a dodgy grey mac.
  • My_Account #53 2 years ago

    I predict that Halo 3 will get a 10/10
    /writes self large pay check
  • mrbandersnatch #54 2 years ago

    File this article under "trade advertising".

    The big boys already spend an awful lot of money on play testing and research both in and out of house (I see a good few games in alpha and beta due to the marketing nature of my work). There currently is no real industry standard as far as performing this kind of research/testing and it sounds like Vertical Slice are trying to address this, or would be if they published their methodology. Unless they publish then it appears they are just looking for some free market awareness articles. If they do publish...personally I'd love a copy and I might just take them up on replicating it :)
  • Mkwone #55 2 years ago

    I think most people can guess a score to a relatively high degree of accuracy after playing it for 5 minutes, heck I bet a lot of people can guess a score going by name alone.

    If I said Uncharted 2 I think most people would guess a score of 8/9 out of 10
  • schnide #56 2 years ago

    Most of the respondents here are idiots. There's something in this, and if it results in better games then why complain?

    Vertical Slice will make a shitload of money too, and well done to them.
  • guernican #57 2 years ago

    What's interesting about this is the comment about biometric feedback... presumably galvanic skin response, tracking of the eye movements and suchlike.

    Almost like a Voigt Kampff test, in fact.

    Thing is, marketing companies have more or less binned these techniques in favour of EEG cognitive testing. If you really want your new product stress tested before you send it to market, you use EEG. There's a company in London called Neurofocus that'll do it for 30 grand.

    So the Brighton boys are using outmoded tech to give a very broadbrush figure. Rock n roll.
  • swisstony #58 2 years ago

    Presumably this works for MMOs?
    I wonder if one minute with the game of Chess would also yield an accurate review score.
  • loopy #59 2 years ago

    "pushing them closer to the "magic line" of 80 per cent."

    It's a shame for them that I only tend to buy games above 85%. :p