Not So Fast:
Can AI really predict when an employee will quit?

“Not So Fast” is an occasional blog series in which CultureIQ experts slice and dice claims around employee research. This is our first “Not So Fast” report.

By Diane Daum

The CNBC headline on Sept. 10 delivered a very bold claim: “This algorithm can predict when a worker is about to quit — here’s how.” The CNBC article was based on a Harvard Business Review report by the algorithm’s creators, who claim their work shows “that by using big data, firms can track indicators of turnover propensity and identify employees who may be at an elevated risk of leaving the organization.”

They go on to say that knowing who is going to quit allows organizations to proactively retain the employee, rather than simply learning about what led them to leave through an employee survey or exit interview.

There are several reasons to be wary of the usefulness of their algorithm, but the first one is the math. While we don’t think the authors intended to mislead, our reaction when we saw the article was that the algorithm’s efficacy could easily be misunderstood. The authors say that “those labeled most likely to leave by the algorithm are 63% more likely to change jobs, as compared to those who were unlikely to be receptive.” Let’s think about this from a base rate perspective. Even if we assume that those most likely to leave are 63% more likely than ANYONE else (not just the “unlikely” group), which is a more generous treatment of their findings, consider how that applies to a typical organization.

Can Data Predict Who Will Leave?

Estimates vary as to the average annual turnover rate across all types of organizations and industries in the US, but most put it in the neighborhood of 20% annually. That means in the 3-month period that the authors were tracking, we might expect 5% of people to leave in a typical organization. So, say we have 100 people. Without knowing anything else about them, we would expect that each one of them has a 5% chance of exiting. The authors don’t tell us what percentage of people were in the “most likely” bucket, but most models would forecast a number of “leavers” that is consistent with the overall turnover rate (e.g., if you are forecasting turnover for the organization to be 5%, the probability of leaving across all employees would generally be consistent with that number).

If we consider that those tagged as “most likely” leave at 1.63 times the rate of everyone else, while holding the 5% base rate constant, that would mean that the 95 people not tagged would each have a 4.85% chance of leaving, and the 5 chosen by the algorithm would (on average) have a 7.9% chance of leaving. Would you lavish organizational resources on your “high turnover risk” employees if you knew they were actually over 92% likely to stay? And if you would, consider that by the time 3 months is over, fewer than 1 (0.4, on average) of the people tagged as “most likely” will have left, but nearly 5 (4.6 people) in the group you ignored are now on their way out the door.
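For readers who want to check the arithmetic, here is a minimal Python sketch of the calculation above. It is our reading of the numbers, not the authors' model: it assumes the share of people tagged equals the base rate (5 of 100), and the function name `split_rates` is ours, chosen for illustration.

```python
def split_rates(base_rate, lift=1.63, flagged_share=None):
    """Split an overall leave rate into tagged and untagged group rates.

    Assumption (ours, not the authors'): the share of employees tagged
    as "most likely" equals the base rate unless flagged_share is given.
    """
    if flagged_share is None:
        flagged_share = base_rate
    # Solve base_rate = (1 - flagged_share) * r + flagged_share * lift * r for r,
    # where r is the leave rate of the untagged group.
    untagged_rate = base_rate / ((1 - flagged_share) + flagged_share * lift)
    tagged_rate = lift * untagged_rate
    return tagged_rate, untagged_rate


headcount = 100
base_rate = 0.05  # roughly 20% annual turnover, spread over one quarter

tagged_rate, untagged_rate = split_rates(base_rate)
tagged_n = headcount * base_rate          # people the algorithm tags
untagged_n = headcount - tagged_n         # everyone else

print(f"tagged: {tagged_rate:.2%}, untagged: {untagged_rate:.2%}")
print(f"expected leavers: {tagged_n * tagged_rate:.1f} tagged, "
      f"{untagged_n * untagged_rate:.1f} untagged")
```

Changing `flagged_share` shows how sensitive the result is to the one number the authors don't report: the size of the “most likely” bucket.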

“But wait,” you say, “I work in a high turnover industry where 25% of people would be expected to leave in a single quarter.” If we apply the same logic here, then of our 100 people, 25 of them will have been identified as “most likely to leave,” and will leave at a rate of 35.2%, while the remaining 75 would be predicted to leave at a rate of 21.6%. So, if you were to predict that people in your “most likely to leave” group were going to exit, you would still be wrong almost two-thirds of the time. And the problem of the “false negatives” still applies. You would likely lose about 9 people from your “highly likely” group, but 16 from the group not predicted to leave.
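The high-turnover case can be sketched the same way. Again, this is an illustration under our own assumptions (the tagged share equals the base rate, and the tagged group leaves at 1.63 times the rate of everyone else); the function name `flag_outcomes` and the dictionary keys are hypothetical, not anything from the authors' work.

```python
def flag_outcomes(headcount, base_rate, lift=1.63):
    """Expected outcomes when the tagged share equals the base rate (our assumption)."""
    tagged = headcount * base_rate
    untagged = headcount - tagged
    # Untagged leave rate r solves: base_rate = (1 - base_rate) * r + base_rate * lift * r
    untagged_rate = base_rate / ((1 - base_rate) + base_rate * lift)
    tagged_rate = lift * untagged_rate
    return {
        "tagged_rate": tagged_rate,
        "untagged_rate": untagged_rate,
        # Share of "will leave" predictions that turn out wrong (tagged people who stay)
        "share_of_tags_wrong": 1 - tagged_rate,
        # Leavers the algorithm tagged vs. leavers it missed (false negatives)
        "leavers_tagged": tagged * tagged_rate,
        "leavers_missed": untagged * untagged_rate,
    }


result = flag_outcomes(headcount=100, base_rate=0.25)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Even with a quarter of the workforce leaving, the missed leavers outnumber the tagged ones by nearly two to one, which is the false-negative problem described above.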

What Lurks Beyond the Math

If the math doesn’t convince you, there are other issues. The authors indicate that the system provides “real time” information on who might be leaving, by considering “turnover shocks” and factors that lead to “low job embeddedness” (which are, admittedly, some of the best predictors in the academic research on turnover).  Some of the examples they give are organizational events (mergers and acquisitions, poor business results), which makes sense when conducting research that spans many organizations, but which is of no value when trying to predict which individuals within an organization are most likely to leave.  In these cases, everyone in the organization would be at heightened risk, and the organization would likely have scant resources to cater to all, so the algorithm would do little for them that they couldn’t do by consulting performance and high potential records to determine who they most want to retain.

Other examples they give are more personal – the birth of a child, or an outside job offer. Indeed, when we recently asked a sample of people who had exited a job in the last 6 months if there was a precipitating event, most of them mentioned a more personal reason, such as relationships at work (e.g., an incident with a boss or coworkers), personal or family reasons, lack of ethics or respect in the workplace, and hearing of other opportunities. Does your HRIS update in real time when these events occur? The authors argue that you could gather this information publicly, but this takes resources and is hardly foolproof.

Beware the Other Kind of AI

In addition, using some types of personal information (gender, for instance) could land you in legal hot water (where you would learn that AI also stands for “Adverse Impact” against protected groups). Employers should not be making decisions about people based on their gender, marital status, or parental status. Another predictor the authors used was “number of previous jobs.” If you hire a lot of “job hoppers” into your organization, it seems like your own choices are the real issue.  Why hire people with a track record of frequent job changes and then turn around and classify them as “retention risks”?

And there are some risks to the people identified as “most likely” to leave.  As the authors point out, you could use this information “for good,” by approaching those at risk and working to retain them.  However, there is also the chance that the information could be used against people.  Perhaps an advancement opportunity or an opportunity to work with a high-profile client would be withheld from someone who was wrongly identified as a “short timer” or worse yet, they could be perceived as disloyal and fired.  In addition, to the extent that employees get wind that you are collecting a lot of personal information about them by scraping websites, this will likely engender mistrust.

Human Problem, Human Solution

So, what can organizations do instead?  First, we acknowledge that embeddedness and shocks are important – but using them to apply predictions to individuals seems premature, and solves for the case, but not for the problem, which will continue to impact other employees.  For example, if you use embeddedness to predict that “Tom in Finance seems to be at risk,” but you don’t solve the larger problem (newcomers who relocated are having a hard time making new connections), you will continue to have others leave for the same reason.  Instead, why not organize some events to help all new employees make connections?

Another thing organizations can do is to empower managers to take actions to retain people when they think they are at risk, such as providing spot bonuses or paying for special development opportunities. Managers are often aware of many of the personal events mentioned in the article that may trigger thoughts of leaving, before these events become part of any data source. Another is to survey your employees. The authors mention that surveys do not give managers a real time picture of who might be considering leaving. Yet surveys can surface issues with the organizational culture that lead to turnover, and give the organization an opportunity to address them well before people head for the door.

At CultureIQ, one of our values is “respect data, but make human decisions”, and that applies here.  We certainly encourage continued research that tries to improve our ability to predict an important outcome such as turnover, and we are in favor of innovative and thoughtful applications of AI and big data.  However, when there is a 92% chance that your judgment about an individual person is wrong, we think that “human decisions” are still a better gamble.

–Diane Daum is a Principal Strategist for CultureIQ. Scott Young, CultureIQ Managing Director, Client Solutions, also contributed to this article.
