Get that second (and third) opinion on a candidate before hiring
One hundred years ago, British statistician Francis Galton was on a mission to prove that intelligence exists only in an elite few. He seized the opportunity to prove it when 800 townsfolk, from farmers to factory workers, were placing bets on the weight of an ox at a local fair. Unsurprisingly, no one individual guessed the exact weight of the ox, 1,198 pounds. But the collective average of the group? 1,197 pounds, an error of just 0.08%. Ever the empiricist, Galton reluctantly concluded that crowds could be right: “the result seems more creditable to the trustworthiness of a democratic judgment than might have been expected.”
If a crowd can accurately guess the weight of an ox, can a group of interviewers select the best candidate to hire? Against the backdrop of increasing awareness of the unconscious biases that affect how we judge others, we at the Behavioural Insights Team were keen to find out if the wisdom of crowds might offer a solution.
In a simple online study, we asked 398 reviewers, sourced through Mechanical Turk, to independently rate interview responses that we had written for four hypothetical, unnamed candidates. Each response answered the generic recruiting question: “Tell me about a time when you used your initiative to resolve a difficult situation?” We provided the reviewers with some guidelines on what a “good” answer should include, similar to how a structured interview is set up.
The reviewers’ combined ratings quickly coalesced around a single “best” response, making for a clear winning candidate. But most organizations can’t afford to ask hundreds of people to help them select a candidate. The critical question is: at what point does the crowd become wise?
We took the data from this experiment and ran statistical simulations to estimate the probability that a given group could correctly select the best candidate. We created 1,000 combinations of reviewers in teams of different sizes, ranging from one to seven people. We then pooled these estimates by the size of the group and averaged the chances of selecting the right candidate.
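For readers who want to try this kind of resampling themselves, here is a minimal sketch in Python. It is not our actual analysis code. The ratings are synthetic, and the quality levels, the noise level, the assumption that every reviewer rates all four candidates, and the rule that a team picks whoever has the highest average rating are all illustrative. The names `simulate_scores` and `prob_correct` are hypothetical.

```python
import random

def simulate_scores(n_reviewers, true_quality, noise_sd, rng):
    """Synthetic ratings (NOT the study data): one list of candidate scores per reviewer."""
    return [[rng.gauss(q, noise_sd) for q in true_quality]
            for _ in range(n_reviewers)]

def prob_correct(scores, best_index, team_size, n_teams, rng):
    """Share of randomly drawn teams whose average rating ranks the best candidate first."""
    n_candidates = len(scores[0])
    hits = 0
    for _ in range(n_teams):
        # Draw a team of distinct reviewers from the pool.
        team = rng.sample(scores, team_size)
        # Assumed decision rule: the team picks the highest mean rating.
        avg = [sum(r[c] for r in team) / team_size for c in range(n_candidates)]
        if max(range(n_candidates), key=avg.__getitem__) == best_index:
            hits += 1
    return hits / n_teams

rng = random.Random(0)                  # fixed seed so runs are repeatable
true_quality = [7.0, 6.0, 5.0, 4.0]     # assumed "true" quality; candidate 0 is best
scores = simulate_scores(398, true_quality, noise_sd=1.5, rng=rng)

# 1,000 random teams for each team size from one to seven.
results = {k: prob_correct(scores, 0, k, 1000, rng) for k in range(1, 8)}
for k, p in results.items():
    print(f"team of {k}: estimated chance of picking the best candidate = {p:.2f}")
```

Raising `noise_sd` mimics reviewers who disagree more, and narrowing the gaps in `true_quality` mimics candidates who are hard to tell apart.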
The graph below illustrates the point: with more people, you are more likely to correctly identify the best person. Or, put another way, with more people you’re less likely to accidentally pass over your best candidate.
The blue line in the graph shows that with ‘easy’ cases (where there’s a significant gap in quality between the best and the second best responses), there’s a 16% chance that a person working on their own will select the wrong person. With a group of three, however, that falls to 6%, and by five reviewers, you can be pretty much certain they won’t make a mistake (1%).
But what about when people disagree more over what ‘good’ looks like? That’s what the green line shows (we artificially increased the variance in scores). Where reviewers disagree more on a given candidate, you need to pool more opinions to gain judgmental accuracy. Moving from one to three reviewers has a big impact: the group more than halves its chance of getting it wrong, going from 33% to 15%.
Finally, what about situations where candidates are really similar, and it’s hard to distinguish between them? The yellow line reveals what we found when we tested the crowd’s ability to separate the second and third best candidates (whose responses were similarly graded). We found crowds are even more important here, and that a single interviewer working on their own can be trusted less than half of the time.
So, what does this mean for a real-world workplace? We recommend getting at least three reviewers to vet each candidate. This can significantly improve the odds of making the best hire.
Kate Glazebrook is a Principal Advisor at the Behavioural Insights Team, a social purpose company co-owned by the UK Cabinet Office and Nesta with the aim of applying behavioural science to social problems. In an effort to use science to help organizations recruit better, the team has created Applied, an online people platform that builds in the wisdom of crowds as well as a bunch of other behaviourally-informed approaches to recruitment.