
Can you predict review scores? News

News by Robert Purchese

4 September, 2009

Brighton-based usability company Vertical Slice claims to be able to predict videogame review scores.

Speaking exclusively to Eurogamer, director Graham McAllister said his techniques can "inform [companies] to make a better game", pushing them closer to the "magic line" of 80 per cent.

"We just finished some research where we can start to predict a year in advance what the game is likely to get," McAllister told us. "And obviously we get more accurate as time goes on.

"This is brand new research: we're aiming to publish it this year. At the minute we're saying we can get it into these bands: low, medium or high. We're not saying we can predict an eight or a nine, that level of granularity - that's subjective behaviour and all sorts of things. That's nearly impossible, to be honest."

"People think you can't predict a game based on quantifiable data," he added. "What we can do is get these estimators. Some people will just have a hard job believing it. We have analysed the statistics to death, thorough and rigorous, and what we're saying is, 'You may not like it, but this is the best model that anyone has come up with to date.'"

Built on - and run out of - the University of Sussex, Brighton, Vertical Slice uses the brains of physicians and psychologists to compile a series of accurate, scientific tests. It's this approach McAllister feels has been "lacking in the industry to date".

The tests come from reverse-engineering 154 Edge magazine reviews (more on that later) and from something called "behavioural or sequential analysis" - otherwise known as what you say and do while playing videogames.

The latter is based on the work of marriage counsellor John Gottman, who predicted within five minutes whether the couple before him would stay together for the next five years. He was 97 per cent accurate, said McAllister, who has a PhD himself.

Thus, he said: "After 30 seconds, we can predict if the game is going to be bad or good, to a certain extent." Specifically, this means sorting them into bands: low (1-4), medium (5-7), and good (8-10).

"What's important about that first minute," he added, "is that it's the time people play a demo for. That's super critical."

Reverse engineering uses patterns of words or phrases from reviews and matches them with scores. "All the high-scoring games talk about certain aspects; all the medium-scoring games talk about certain things; and all the low-scoring games talk about certain things. And there's a very clear mapping between them."

Who Vertical Slice picks to test the games is handled just as carefully, and the net is cast much wider than usual.

"Four or five years ago people made games for gamers: geeks making games for other geeks, basically. Now you have people making games for everybody. The problem is that the games companies don't understand everybody," said McAllister. "We're very, very thorough about the people we choose to test the game."

Vertical Slice interviews those people militantly, and is investigating psychological profiling techniques to uncover bias tendencies. "No one else is doing these sorts of things. It's really detailed information on who you are testing with."

And you can't lie. "Biometrics is our big thing; we hook people up to equipment that will measure your heart-rate or skin response. If someone says, 'This is the scariest game ever,' we'll be able to say, 'Really? Well, we don't think so.' And we'll be able to prove it," said McAllister.

By adding all of those components together, McAllister said Vertical Slice can very accurately predict the outcome of a game.

"We look at what [the testers] say, we map it onto what they were doing at that time on the video, how their video was responding on the biometrics, and we say, 'Look, all these indicators are saying low engagement, this is not looking good, so what's the cause of that?'" he offered.

With that information, McAllister can suggest improvements. "We're not game designers," he said, "but we can influence the game design."

"80 per cent is just so important these days. This is the magic line, can we get our clients towards that or over it?"

Vertical Slice likes to work with companies as early as possible to make sure "they're on track to deliver a bloody good game". Most market research is done on Alpha builds, which is "too late", McAllister reckons. "We're trying to help them much, much further back".

"There's no reason why you would not want it," reasoned McAllister, "the return on investment is potentially huge. At the minute, our clients range from PS3 developers to iPhone developers."

He's not trying to put critics out of a job, either. The usability testing is designed to "complement" reviewers or "expert game players" as he flatteringly dubs them. Even functionality testers like Babel have their place. "All these people should be involved," he said.

"What we hope is that you're reviewing better games."

Pure developer Black Rock has worked with Vertical Slice before. Pure and Split/Second game designer Jason Avent reckons usability testing added 5-10 per cent to Pure's scores, and is keen to work with McAllister in the future.

"We'll be working with Graham in specific cases where we need independent views and testing or where we can make use of their expertise in rigorous and detailed scientific analysis," Avent told Eurogamer.

In closing, McAllister said he and Vertical Slice have nothing to hide; this is not sorcery but science.

"We're putting it out there and saying, 'Here's our model in detail; broken down, scientific methodology, data analysis, results.' Anybody can look at it, it's completely open, they can go and repeat our process and probably get the same data," he said. "It'll be interesting to see how everyone reacts, you know?"

Find out more over on the Vertical Slice website.


Comments: 1-50 of 60 in total | next 50 »

stevetuck
04/09/09 @ 16:32
#1
+37
Article reads like a 10 but then ends up being a 7 :(
gizmo
04/09/09 @ 16:32
#2
+8
8/10
Vroom!
04/09/09 @ 16:34
#3
-4
Lol.
tomdominer
04/09/09 @ 16:44
#4
+38
Wow! A company can predict if a game is going to be shit JUST BY TRYING IT OUT!
BlueDot
04/09/09 @ 16:45
#5
-4
"who predicted within five minutes whether the couple before him would stay together for the next five years. He was 97 per cent accurate"

Hmmm... What about a self-fulfilling prophecy? He "knows" the end result, so he will work harder to keep the couple together, or not bother trying too hard.
kendoji
04/09/09 @ 16:46
#6
+12
Interesting, but I really wonder how much more accurate this is than just getting a few gamers in and asking them if they like it or not.
sarcasmoidosis
04/09/09 @ 16:51
#7
+11
They can say if a game's shit, average or good by trying it for a while? Wow, that's quite a breakthrough :)
hiddenranbir
04/09/09 @ 16:55
#8
+8
At the minute we're saying we can get it into these bands: low, medium or high.

Wow... I hope they did this all in their spare time. Otherwise we've seen a big waste of money.
lunnyt00n
04/09/09 @ 17:02
#9
+5
I, like everyone else on this site, have been doing this for years.

Go and spend some of that research money on a PS3 and start it Folding@Home. At least that research will be useful and money well spent.

What a crock.
20charactersmax
04/09/09 @ 17:06
#10
0
Only EG review scores.
TOOTR
04/09/09 @ 17:13
#11
+5
I've just been reading about John Gottman's marriage prediction analysis in Malcolm Gladwell's book 'Blink'. It is fascinating stuff.

Educate yourselves unbelievers! ;)

If more usability testing and profiling is spent during a games development to improve games quality - please tell me how this is a BAD thing?

mizcicz
04/09/09 @ 17:13
#12
+2
I can predict that these clever guys sorted out a fine way to rip off the management guys in the gaming industry... well, they probably deserve it...
MMMarmite
04/09/09 @ 17:27
#13
+3
Thus, he said: "After 30 seconds, we can predict if the game is going to be bad or good, to a certain extent." Specifically, this means sorting them into bands: low (1-4), medium (5-7), and good (8-10).

"What's important about that first minute," he added, "is that it's the time people play a demo for. That's super critical."


So there's 30 seconds in a minute, that's where I've been going wrong all this time /o\
Cpt_McOneball
04/09/09 @ 17:27
#14
+1
Hang on, I can do that. If something looks shit, IT'S GOING TO BE SHIT. Another popular method of telling how a game is going to be awful: Is it a movie tie-in?
Stompy
04/09/09 @ 17:43
#15
+2
"If more usability testing and profiling is spent during a games development to improve games quality - please tell me how this is a BAD thing?"

We already have quality control for games.
We already have dissenting voices saying, "wait a second, this sequel is barely an improvement at all. Are we just trying to milk the fans?"

But we also have development budgets that can't be wasted needlessly, shareholders, and the need for profits.
We also have buyers who will buy a bunch of chod just because it's branded with a rich chav's face. (Case in point: people actually listen to Chris Moyles).



End result: no change, just earlier moaning.
CaoSlayer
04/09/09 @ 17:55
#16
+2
They could improve the accuracy by adding how much money is paid for ads in the media.
Rack
04/09/09 @ 18:04
#17
+2
These are Edge reviews, it's like they're trying to use statistical modelling to predict the roll of a dice.
giant_frying_pan
04/09/09 @ 18:30
#18
+1
Better than Halo.
Anthony_UK
04/09/09 @ 18:33
#19
+2
Lost interest in that half way down...

Really can't believe developers pay to have these people tell them whether their game is going to be crap, good or amazing! Surely they could have someone in house tell them the exact same thing? An honest opinion from the Q&A team? The producer? Or even inviting some people from their official forums etc??

Seriously, if any of these devs employed me to do the same job, I'd bet everything I own I would come up with the exact same results as these people!

Blackrock, Infinity Ward, Bungie.... Hell, email/pm half a dozen people from this forum and I'm sure we'd be willing to offer the exact same service for free!
Edited 1 times, most recently on 04/09/09 @ 19:40
JahB
04/09/09 @ 18:47
#20
+3
An honest opinion from the Q&A team? The producer?

all of these are too close for proper judgement; if you spend 40-80 hours a week with one game only, it's impossible to have an impartial opinion on it, so an "outside" set of eyes can see things the people on the inside might look over.

however, usually outside playtesting fills that gap. throwing significant money into an agency like this seems a bit like overkill to me
grantc7
04/09/09 @ 18:47
#21
0
Woah, I would really like to see this on Dragons' den...
Spekingur
04/09/09 @ 18:49
#22
+4
QA teams are normally in-house. A third party that won't be afraid telling a company that the interface is cumbersome or if the game won't be all that good unless they change this and that - a company like that is a good thing.
Anthony_UK
04/09/09 @ 19:06
#23
0
Actually, asking the Q&A team probably wouldn't be the best idea; fair point to the posts above.

I just can't believe they need to pay another company to tell them something that should be fairly obvious to anyone remotely interested in videogames!

Or maybe I'm just jealous that I didn't come up with the idea myself!
Oh-Bollox
04/09/09 @ 19:26
#24
0
If more usability testing and profiling is spent during a games development to improve games quality - please tell me how this is a BAD thing?

It won't be used to improve game quality. It will be used to further rig the review scores they achieve.
LewisResolution
04/09/09 @ 19:58
#25
+1
So these hard scientific methods that they're throwing out there for everyone to see: where are they? They're not in this article, and not on their website.

I'm genuinely interested actually. Might make a phonecall next week.
Velios
04/09/09 @ 20:05
#26
+1
So basically, a weird company preying on the insecurities of game developers and advising them to adopt a cookie-cutter approach to making video games in order to achieve better scores.
TOOTR
04/09/09 @ 20:14
#27
+2
@Velios - not really. From what I can gather they are not advising the game makers per se on how to 'make' a game - they are testing people's reactions, emotions, conversations while playing the game.

And by using something called 'slice analysis' they can give the developers a good idea whether they are knocking it out of the park or are frustrating, boring, exciting or otherwise emotionally engaging the target audience, using methods that simple Q&As don't uncover.

It's not 'review' rigging at all - but they are obviously selling it as a 'review predictor' because some companies are, rightly or wrongly, starting to KPI the teams with bonuses etc on Metacritic scores.
Weezer
04/09/09 @ 21:20
#28
-2
Bollocks.
layleeloo
04/09/09 @ 21:48
#29
+4
That article title made me laugh. Most people on this site think they can score a game before its out.

And what about PLAY.com. All the idiots who write reviews for films, games and CD's giving them 5 stars before they have even come out saying "did is gunna b wikd".

Retards
Jonny5Alive7
04/09/09 @ 21:53
#30
0
They don't seem to do anything much out of the ordinary. It's just getting independent people to do a bit of testing, which virtually any member of the public with a bit of video game experience could do really. I think a lot of us can tell what score a game is going to be getting. For instance, GTA IV was always going to be getting 10/10: GTA games are always very good, and even if this one wasn't quite as good the hype would keep the score high.
Batfink
05/09/09 @ 00:13
#31
0
If any publisher started using such a service to vet their upcoming releases, the smart developers should game the system to score a 'high' mark, regardless of the actual quality of the game.
gaselite
05/09/09 @ 06:18
#32
0
First the DoA tits story and now this. Jesus.

Videogames: Sometimes meritorious but too often formulaic and homogenous, and occasionally present backward, adolescent perspectives of the world. Things should improve when the medium emerges from its own adolescence. 6/10
Xerx3s
05/09/09 @ 08:34
#33
0
Hurray for review scores based on hype expectancy! \0/
optimusprym8
05/09/09 @ 08:58
#34
+2
Surely the main data they need is how much money is paid to a magazine for a score or a retail chain for an agreed chart position or who in the PR team wrote the review to send to the smaller non-gaming press
hyperkineticninja
05/09/09 @ 09:29
#35
+4
EA have an in-house game evaluation team that do speak up when a game is bad. Obviously it's mainly the studios that are putting effort into a game and want it to be good that use this service, which is generally why you see such differing games as Dead Space and G.I. Joe.

***Edit: That is why FIFA does improve each year, as the game eval team get involved very early on in the process.
Edited 2 times, most recently on 05/09/09 @ 10:31
barnettbeans
05/09/09 @ 11:15
#36
-2
Can you predict if it's better than Halo or not?
Bremenacht
05/09/09 @ 13:42
#37
-1
The science of opinion.
Why not save all the cash and just read what people have to say on a forum?
Freek
05/09/09 @ 14:00
#38
-1
Yes, you could hire that company, or you could just ask "What do you think?" whenever the game is previewed and people get some hands-on time with it.
byakuya83
05/09/09 @ 14:37
#39
-1
They can better predict the review score when the game is further along in development, haha. Is this even necessary? I don't need any special software to predict a large majority of games reviewed on this very site will score the uncontroversial, middle of the road score, 7/10.
TheLittlestHobo
05/09/09 @ 16:22
#40
-1
I watched the video for Raven Squad and after 30 secs I could tell that it would be a turkey; the voice acting alone was a big red flag. Maybe I could start a new career as a video game quality consultant. I'd be much cheaper and I'll offer 6 bands (low, low medium, medium, high medium, low high, high) indicated with easy to understand verbal metrics: "LOL!", "Shit!", "Alright", "Not Bad", "Pretty Good", "OMFG!". Going back to Raven Squad, I would give that a "LOL!". I will await Eurogamer's review and see how I fare, then will be happy to accept any offers.
Dr_Wadd
05/09/09 @ 16:25
#41
+3
I'm not entirely convinced about the claims that you can judge a game fairly within the first minute. Leaving aside the fact that a lot of games have opening cut-scenes that last longer than a minute, I can immediately think of a couple of examples where the first minute entirely failed to be representative of the game as a whole. Mass Effect I found to be a bit of a tedious chore at first and I have to admit I wasn't overly impressed, but as soon as I got free rein of the Normandy I thoroughly enjoyed the game. The first minute of Ninja Blade is pretty much just a QTE event, so if I were to rate that game on the first minute I'd conclude it was nothing but QTEs. How about games that have flawed control schemes that, at first, appear to cripple the game but can be adapted to? I'm sure I'm going to get flamed for this, but I actually enjoyed the Iron Man game on the 360, but only once I had gotten used to the obtuse control scheme. Once I had that nailed I found it a lot of fun, but the immediate impression was that it was totally uncontrollable.

I'm curious about their biometric testing. From the description it sounds like they are using polygraph type kit, but that can be woefully inaccurate. It's not clear from the description whether they measure things like heart rate while the game is being played, or if it is used to pick up on untruths during the player's feedback. With the former approach I can see some merit, not so for the latter.

I'm generally wary of their claims, but I would have to see a lot more detail in order to make a totally fair judgement. Might have to drop them a line, I need to find a new job and this would seem a top way of combining my programming skills, love of gaming and Psychology PhD.
jonbwfc
05/09/09 @ 16:28
#42
+6
"The tests come from reverse-engineering 154 Edge magazine reviews (more on that later)"

I don't see any more on that. Maybe you should get someone to scientifically analyse your articles?
Dr_Wadd
05/09/09 @ 16:39
#43
0
@ jonbwfc, it's there, but not with a great amount of detail.

Reverse engineering uses patterns of words or phrases from reviews and matches them with scores. "All the high-scoring games talk about certain aspects; all the medium-scoring games talk about certain things; and all the low-scoring games talk about certain things. And there's a very clear mapping between them."
kentmonkey
05/09/09 @ 16:57
#44
0
ShopTo seem to be able to predict the review score and the text! ;o)
RobotRocker
05/09/09 @ 18:35
#45
0
7/10 for PS3 exclusives is on the mark considering you still have to average out for Lair

/stokes the BBQ
Clive Dunn
05/09/09 @ 20:23
#46
0
Yes Dr McAllister thank you. Clearly the publisher has no clue if the title is great or a crock of shite before the reviewers get their hands on it.

Your consultancy fee will be well justified.
bubu.3k
05/09/09 @ 21:44
#47
0
Well, such a company, if well intended, might actually do good pushing games to a better quality. A company just looking for money might do the same thing but with a lesser chance of success tho.
davisorle
05/09/09 @ 22:06
#48
-2
So they are researchers who will be able after a long time to tell us if the title that will be published a year from now will be bad, ok or good? Wow, I should be paid for it then, since I do believe I can do that myself from now. Should hire me. I'm willing to be paid for that too as a part time job. Did those ppl run out of ideas on how to scum or something? lol
Discalceaterabbit
05/09/09 @ 22:46
#49
0
I'm surprised they didn't show the equation,

Review body + game * advertising revenue from publisher = review score
Gearskin
06/09/09 @ 00:30
#50
0
Rule of Thumb - Eurogamer always goes one above or one below the norm. And on occasion makes up a number between one and ten. Like Threesix for example.
Edited 1 times, most recently on 06/09/09 @ 01:31
