Paul DePodesta revolutionized the world of baseball when he suggested using strictly quantitative data to select players for a team. Forget a player’s confidence level, his pitching style, and his past trades. Forget his body-mass index and the fact that he has an ugly girlfriend (which, according to scouts in “Moneyball” 1, means the guy has no confidence). Instead, focus only on one algorithmically curated number: the player’s quantitative value.
DePodesta’s fictionalized character in the 2011 film “Moneyball,” Peter Brand, saw numerical value in players that no one else did. And it worked. He and Billy Beane, the Oakland Athletics general manager, put together an unconventional team for the A’s on an unusually low budget and set an American League record by winning twenty games in a row.
“People are overlooked for a variety of biased reasons and perceived flaws: age, appearance, personality,” Brand says in his pitch to Beane, explaining baseball statistician Bill James’s theory of sabermetrics, the application of statistical analysis to baseball. This empirical approach ignores a player’s personality, whether that be his horrible attitude at the plate or his stunning passion to improve himself every day. Sabermetrics would have turned away all those inexperienced but promising backyard players who star in inspirational baseball movies. But the game of baseball is already cold for players: they wake up one morning wearing a Giants jersey only to find themselves flying across the country in a Cubs jersey that same night.
A purely numerical approach to player recruitment makes sense given the high stakes of Major League Baseball. But what if we brought this same hiring approach to corporate America? What if hiring employees became an algorithmic process designed to select the best team based solely on numbers? What if your next interviewer were a computer?
The ‘Computerized Interview’
Given the increase in four-year college degrees and internships in recent decades, the number of qualified candidates for any job opening is on the rise. Forced to deal with a growing applicant pool, companies schedule up to three interviews per candidate and often fail to reach all candidates. To reach more candidates and speed up the interview process, recruiters have turned to computerized interviews.
Many companies, such as Goldman Sachs, Hilton, Target, and Amazon, conduct computerized interviews through prompt-based software, where candidates receive a question and record their answers. 2 While human evaluators initially reviewed these responses, some companies have recently integrated AI graders that score according to a behavioral rubric. 3 At Great Western Bank, a regional bank based in South Dakota, applicants walk through an interactive assessment where they handle customers’ requests for money withdrawals, account openings, and more. They deal with difficult requests and angry customers. Their responses are recorded and scored later by a bank official. The bank reported that the process not only reduced “useless” personal interviewing but also ensured that those hired were less likely to be fired or quit within 90 days. The creative nature of the interview gave candidates a more accurate glimpse of the job for which they were applying. 4 Hilton credited the process with shrinking its hiring timeline from six weeks to five days. 5
Introducing AI Scoring
Initially, AI scoring for computerized interviews was introduced as another way to save recruiters time by presenting them directly with an evaluation of each candidate. Platforms such as HireVue have taken it a step further, claiming that automated scoring also reduces the implicit biases that humans may act on when hiring. Their stated mission is to closely follow the 1978 Uniform Guidelines on Employee Selection Procedures and “aid in the achievement of equal opportunity for everyone regardless of gender, ethnicity, age, or disability status.”
Humans hold prejudices that become apparent only in hiring statistics. According to Freakonomics and a field experiment on labor market discrimination, a resume with a white-sounding name is 50% more likely to get a callback than an identical resume with a black-sounding name. 6 This is analogous to scouts overlooking older baseball players as “out of their prime” and slow on the bases, regardless of their true performance.
HireVue wishes to eliminate such bias in hiring via fair algorithms…but what constitutes a “fair” algorithm? Nathan Mondragon and his team of Industrial-Organizational psychologists at HireVue believe that a structured, algorithmic interview is one of the best and fairest predictors of job success. 7 They pledge to use diverse datasets, hire a diverse team, and consistently review their algorithms for bias. Mondragon offered some insight into the algorithm, stating that a 30-minute interview with around six questions can provide 500,000 data points used in the calculation of a person’s score. For instance, the AI may assess how a candidate’s face moves to determine their level of excitement about a task.
“Those ‘Facial Action Units,’ Mondragon said, can make up 29 percent of a person’s score; the words they say and the ‘audio features’ of their voice, like their tone, make up the rest.”
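To make that weighting concrete, here is a minimal sketch of how a 29/71 split could combine two sub-scores. Only the 29 percent figure comes from Mondragon's description; the function name, the 0 to 100 scale, and the treatment of words and audio features as a single blended speech score are assumptions made for illustration. HireVue has not published its actual formula.

```python
# Illustrative only: the 29 percent figure comes from the article; everything
# else (names, the 0-100 scale, a single blended speech score) is assumed.
FACIAL_WEIGHT = 0.29
SPEECH_WEIGHT = 1.0 - FACIAL_WEIGHT  # words plus audio features: "the rest"


def composite_score(facial: float, speech: float) -> float:
    """Blend two sub-scores (each 0-100) into one weighted score."""
    for name, value in (("facial", facial), ("speech", speech)):
        if not 0.0 <= value <= 100.0:
            raise ValueError(f"{name} score must be between 0 and 100")
    return FACIAL_WEIGHT * facial + SPEECH_WEIGHT * speech


# Two candidates give equally strong spoken answers, but the facial model
# reads one of them 20 points lower.
baseline = composite_score(facial=80, speech=80)
misread = composite_score(facial=60, speech=80)
gap = baseline - misread
```

Under these assumptions, a 20-point misreading of facial expressions alone shifts the final score by about 5.8 points, which could matter when candidates are ranked closely.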
False Promises
The technology hasn’t been around long enough for statistics on its results to be published, but problems have already arisen. In 2019, the nonprofit Electronic Privacy Information Center filed a complaint against HireVue with the Federal Trade Commission, claiming that its use of AI to score video interviews constituted “unfair trade practices.” In response, HireVue discontinued facial expression monitoring during the interview process. 9
Facial monitoring bias is already a hot topic in the news, as facial recognition is known to be less accurate for people of color than for their white counterparts. 10 Several other components of HireVue’s technology have been flagged as needing improvement (understanding accents, response length, word diversity, etc.). The allegations of bias aren’t isolated to HireVue; in fact, they are criticisms leveled at scoring algorithms across sectors. Critics have long questioned whether bias-free scoring is feasible at this stage of technological development. In 2015, Amazon found that its experimental hiring program unfairly rated female job applicants. Because the algorithm was trained on 10 years of male-dominated employee history, the program penalized resumes containing the word “women’s” and applicants who had attended all-women’s colleges. “In effect, Amazon’s system taught itself that male candidates were preferable… Amazon edited the programs to make them neutral to these particular terms. But that was no guarantee that the machines would not devise other ways of sorting candidates that could prove discriminatory, the people said.” 11
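One common first check for this kind of disparity is the four-fifths rule from the Uniform Guidelines on Employee Selection Procedures: if a group's selection rate falls below 80 percent of the highest group's rate, that is generally treated as evidence of adverse impact. The sketch below applies the rule to made-up numbers. The function names, group labels, and counts are hypothetical and are not drawn from Amazon's or HireVue's data.

```python
def selection_rates(outcomes):
    """Map each group to selected / applicants.

    outcomes: {group_name: (selected_count, applicant_count)}
    """
    rates = {}
    for group, (selected, applicants) in outcomes.items():
        if applicants <= 0:
            raise ValueError(f"{group} has no applicants")
        rates[group] = selected / applicants
    return rates


def adverse_impact_ratios(outcomes):
    """Divide each group's selection rate by the highest group's rate."""
    rates = selection_rates(outcomes)
    best = max(rates.values())
    if best == 0:
        raise ValueError("no group had any selections")
    return {group: rate / best for group, rate in rates.items()}


def flagged_groups(outcomes, threshold=0.8):
    """Return groups whose ratio falls below the four-fifths threshold."""
    ratios = adverse_impact_ratios(outcomes)
    return sorted(group for group, ratio in ratios.items() if ratio < threshold)


# Hypothetical screening results: (advanced to next round, total applicants)
hypothetical = {"men": (60, 200), "women": (18, 100)}
ratios = adverse_impact_ratios(hypothetical)
flagged = flagged_groups(hypothetical)
```

In this invented example, men advance at a rate of 30 percent and women at 18 percent, giving women a ratio of 0.6 and flagging the screen for review. Passing such a check does not show that an algorithm is fair; it only rules out one visible symptom.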
Black Box
Although hiring algorithms have become more widespread, they remain a black box. Technologists can control for bias in one area and accidentally introduce a new bias elsewhere in the algorithm. AI scorers deprive candidates of crucial feedback, as candidates cannot ask about areas for improvement after an interview. Moreover, even when companies detect bias, they are reluctant to share their datasets for fear of divulging trade secrets.
“HireVue offers only the most limited peek into its interview algorithms, both to protect its trade secrets and because the company doesn’t always know how the system decides on who gets labeled a ‘future top performer.’” 12
If we continue to use these black box algorithms, our best path is a completely representative dataset composed only of factors that reflect an individual’s capability, not his or her background or identity. That way, no matter which factors the algorithm latches onto, it will be fair.
But is this what we need? To hire employees without understanding where they come from and who they are? Are we moving towards a reality where one’s abilities are all that matter, and not one’s determination, enthusiasm, and grit? Will we automatically choose the employee with a stellar resume who came from money over the candidate who rose from poverty to success simply because he can speak more about his past work experiences? The algorithms take into account enthusiasm and word syntax, but these factors do not encompass one’s overall life story, passion, or true love for the job. Enthusiasm can be faked and rhetoric can disguise a lack of expertise. Ultimately, a computer cannot yet connect with a human on a non-quantitative level. That personal connection is what so many people rely on when interviewing.
“I’d personally do better when I can see the person I’m interacting with,” shared a recent interviewee for a summer internship that used HireVue. “If you’re awkward in front of the camera, that doesn’t mean you’ll be awkward in the workplace or in front of an interviewer.” The applicant noted that she felt slightly uncomfortable staring back at herself instead of watching her interviewer’s reactions to her responses.
The Future of Computerized Interviews
The promise of AI scoring is incredibly appealing: humans are biased, so if they are taken out of the equation, the process will no longer be biased. But that’s not quite how it works. The historical data fed into the system and the algorithms written by humans mirror the implicit biases prevalent in our society. Companies such as HireVue do their best to ensure their algorithms and data are unbiased, but that technology simply does not exist yet. To this day, facial recognition algorithms often misidentify black faces. Courtroom algorithms assign higher recidivism scores to people of color. Women are penalized in hiring processes for being women. Candidates rejected by computerized hiring systems do not know why. We aren’t at the point where we can trust technology to be perfectly fair, and yet HireVue interviewed more than 5.6 million people around the world in 2020. 13
This number will continue to rise. More companies will turn to automated hiring, either out of convenience or in pursuit of a more ethical method of evaluation. Data scientists and psychologists should continue to refine AI, but in the meantime, society should have a broader conversation about which aspects of our lives we should automate. Data collection: automate it. Product testing: automate it. Medical diagnosis: automate it. But hiring employees? For machines to decide whether applicants can thrive at a company, computers need to develop the cognition to understand humans and their stories, passions, and aspirations. Until then, we can use them to augment, but not replace, human interviewers.
First, computers can augment human interviews by facilitating immersive assessments that can be graded objectively. A computer program could assess a cashier’s ability to make change or a grocery bagger’s ability to pack a bag. For more in-depth behavioral interviews, prompting software could be used for biographical questions, allowing applicants to quickly recall details from their resumes while saving in-person interview time for more story-based questions.
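The cashier example shows what “objectively graded” can mean in practice: the right answer is known in advance, so no judgment about the candidate is required. The sketch below grades a change-making exercise under that idea. The function names, the use of US coin denominations, and the two-part result (correct total, fewest coins) are assumptions for illustration and do not describe any vendor’s assessment.

```python
US_COINS = (25, 10, 5, 1)  # cents, largest first


def fewest_coins(amount_cents):
    """Greedy breakdown, which is optimal for US coin denominations."""
    counts = {}
    for coin in US_COINS:
        counts[coin], amount_cents = divmod(amount_cents, coin)
    return counts


def grade_change(price_cents, paid_cents, answer):
    """Grade a candidate's change.

    answer: {coin_value_in_cents: count} as entered by the candidate.
    Returns whether the total is correct and whether it uses the fewest coins.
    """
    owed = paid_cents - price_cents
    if owed < 0:
        raise ValueError("customer paid less than the price")
    valid = all(coin in US_COINS and count >= 0 for coin, count in answer.items())
    if not valid:
        return {"correct": False, "optimal": False}
    given = sum(coin * count for coin, count in answer.items())
    correct = given == owed
    best_count = sum(fewest_coins(owed).values())
    optimal = correct and sum(answer.values()) == best_count
    return {"correct": correct, "optimal": optimal}


# A 63-cent purchase paid with a dollar: 37 cents is owed.
ideal = grade_change(63, 100, {25: 1, 10: 1, 1: 2})
clumsy = grade_change(63, 100, {10: 3, 5: 1, 1: 2})
short = grade_change(63, 100, {25: 1, 10: 1})
```

Every candidate who enters the same coins gets the same result, which is the property that makes this kind of task suited to automation in a way that open-ended behavioral questions are not.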
Second, to limit the responsibility placed on the AI, automated scorers should not make definitive decisions. For the most part, the technology can be used for preliminary screening to weed out unfit applicants who fail to complete the interview. For companies that hold two to three interviews per candidate, the computerized interview can handle the first round and provide a preliminary judgment. This still cuts down on time and cost.
Finally, and arguably most importantly, there needs to be transparency. Even if the technology is 100% fair, until there is transparency, there will always be doubts in the minds of interviewees. Companies should share feedback with interviewees, perhaps via a rubric scored by the algorithm. Last year, Illinois’s governor signed a first-in-the-nation law requiring employers using AI-based hiring to explain “how the artificial intelligence works and what general types of characteristics it uses to evaluate applicants” in addition to obtaining consent. 14
Taking a “Moneyball” approach to hiring may make it easier to navigate large pools of applicants. But the lack of transparency and the allegations of bias associated with these tools make it imperative that we implement them carefully. Until we fully understand these technologies, computerized interview tools should augment, improve, and explain.