Experts say that bias is one of the biggest problems facing the development of artificial intelligence. When a data set reflects systemic discrimination and bias in the real world, that bias gets encoded into an automated system, with potentially dire consequences for who gets a job, who gets a loan, and even who goes to prison.
Yet it can be hard to tell when a data set is biased, especially when these systems are built by homogeneous teams consisting mostly of white men. Even the existing tools that are meant to test algorithms can be biased. Take what’s known as a “benchmark data set,” a collection of data used to assess an AI’s accuracy. Two common benchmark data sets used to test facial recognition systems, known as IJB-A and Adience, are composed of 79.6% and 86.2% light-skinned faces, respectively, which means that these benchmarks don’t test the accuracy of an algorithm on darker-skinned faces with the same rigor as on lighter-skinned ones.
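To see why a skewed benchmark can hide a problem, consider a rough sketch of the arithmetic. The benchmark shares below come from the figures above; everything else is an assumption made for illustration: the two per-group accuracy values are hypothetical, and the sketch treats every face that is not light-skinned as darker-skinned. Under those assumptions, the overall score is just an average of the two group accuracies, weighted by how much of the benchmark each group makes up.

```python
# Share of light-skinned faces in each benchmark, as reported in the text.
LIGHT_SHARE = {"IJB-A": 0.796, "Adience": 0.862}

# Hypothetical per-group accuracies, chosen only for illustration.
# They are not measurements of any real facial recognition system.
ACC_LIGHT = 0.99
ACC_DARK = 0.65


def overall_accuracy(light_share, acc_light, acc_dark):
    """Return benchmark accuracy as the share-weighted average of two groups.

    Assumes the benchmark splits into exactly two groups, so the
    darker-skinned share is 1 - light_share.
    """
    return light_share * acc_light + (1.0 - light_share) * acc_dark


if __name__ == "__main__":
    for name, share in LIGHT_SHARE.items():
        score = overall_accuracy(share, ACC_LIGHT, ACC_DARK)
        print(f"{name}: overall {score:.1%}, darker-skinned group {ACC_DARK:.0%}")
```

With these made-up numbers, a system that fails on roughly one in three darker-skinned faces would still post an overall score above 90% on either benchmark, because the group it handles poorly accounts for so little of the test.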