Sramana Mitra: I’m listening and thinking about the zip code example that you took me through just a few minutes ago. There is going to be bias if you apply that. It’s a lot better to have a job close to your home.
There will be a certain amount of advantage that will accrue to people who live close to those retail stores and perhaps there are people who couldn’t afford to live close. Wouldn’t your algorithm be biased against those people?
Stuart Nisbet: You are exactly right. For every recommendation that we make, we run adverse impact assessments after the fact. Even though we didn't use age, ethnicity, or gender, we check whether the recommendation that was made had any adverse impact.
We do the same thing for our cognitive and personality assessments. If we are looking for whether someone is going to be a good cashier, we ask questions about whether they enjoy interacting with people or prefer a solitary work environment.
Some unintentional differences across ethnicities can creep in, but you can find those through adverse impact assessments. To your point, it's one of the things we are spending a tremendous amount of time on, even if we just look at zip codes.
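The adverse impact assessments described above are often based on comparing selection rates across groups. A minimal sketch of one common approach, the "four-fifths rule," is below; this is an illustration, not Cadient's actual method, and the group names and numbers are hypothetical.

```python
# Illustrative sketch of an after-the-fact adverse impact check using the
# common "four-fifths rule": a group's selection rate should be at least
# 80% of the highest group's rate. All figures here are hypothetical.

def selection_rate(selected, applicants):
    """Fraction of applicants who received a positive recommendation."""
    return selected / applicants

def adverse_impact_ratio(rate_group, rate_reference):
    """Ratio of a group's selection rate to the reference (highest) rate."""
    return rate_group / rate_reference

# Hypothetical outcomes, tallied after recommendations were made
groups = {"A": (50, 100), "B": (30, 100)}  # (selected, applicants)
rates = {g: selection_rate(s, n) for g, (s, n) in groups.items()}
reference = max(rates.values())

for g, r in rates.items():
    ratio = adverse_impact_ratio(r, reference)
    flag = "potential adverse impact" if ratio < 0.8 else "ok"
    print(f"group {g}: rate={r:.2f}, ratio={ratio:.2f} -> {flag}")
```

Note that this check works even when the model never saw age, ethnicity, or gender as inputs: it looks only at outcomes, which is what catches proxy effects like the zip code example.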
There are zip codes that are affluent and there are zip codes that are of lower incomes. There are also disproportionate numbers of ethnic roots in certain zip codes.
Just to be clear, we haven’t made recommendations based on the distance to travel but that is the type of thing that we are trying to pioneer – to ask whether there is a difference in my willingness to work as a 56-year-old white male in a job that is 20 miles away compared to two miles away.
We don’t know the answer to that yet, but these are the questions that we are asking through data science. The key to greater retention is explainability and trust. There are a lot of companies that have proprietary algorithms and approaches to what they are doing and they don’t want to share their secret sauce or the way that they are doing something.
The value that I think we bring is that we are using open source software. We would share all of the algorithms that we are using with anyone. We would share the approaches that we are using and the results that we are delivering.
What we bring is hundreds of millions of records of data that our clients have entrusted us with and that we use to build these models. Note that there is no personally identifiable information in them. We train each model on a single company's information and we don't commingle data across companies. It is completely open and transparent. We wouldn't share our clients' application data, of course.
I'm speaking to a lot of companies that have done a wonderful job, for example, Amazon. There was a famous case where they were using machine learning to help hire developers. They looked at all the applications and the people they had hired, and they started making recommendations much like how Amazon makes recommendations for buying decisions.
They stopped it very quickly because it showed an institutional bias toward males: they had always hired males, so the algorithms trained on that history continued to recommend only males. They had an opportunity to remove those variables from the training and look only at education, experience, and other variables, which would have helped remove the bias.
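The idea of removing protected variables before training can be sketched as below. This is a hypothetical illustration, not Amazon's or Cadient's actual pipeline, and the field names are made up.

```python
# Hedged sketch: exclude protected attributes (gender, age, ethnicity)
# from the features a hiring model is trained on, so it sees only
# education, experience, and similar variables. Field names are
# hypothetical.

PROTECTED = {"gender", "age", "ethnicity"}

def training_features(record):
    """Return a copy of an applicant record with protected fields removed."""
    return {k: v for k, v in record.items() if k not in PROTECTED}

applicant = {
    "education": "BS Computer Science",
    "experience_years": 4,
    "gender": "male",
    "age": 56,
    "ethnicity": "white",
}
print(training_features(applicant))
# prints {'education': 'BS Computer Science', 'experience_years': 4}
```

Dropping protected columns alone does not guarantee an unbiased model, since proxies such as zip code can remain in the data; that is why the after-the-fact adverse impact checks discussed earlier are still needed.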
This segment is part 5 in the series : Thought Leaders in Artificial Intelligence: Stuart Nisbet, Chief Data Scientist, Cadient Talent