Behind the code: How AI is perpetuating racism, sexism, and LGBTphobias
Artificial Intelligence is the simulation of human intelligence by computer systems, through capabilities such as learning, self-correction, or even playing games. To do this, AI systems are fed massive amounts of data in which they identify patterns that guide their decision-making.
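To make that learning process concrete, here is a minimal, hypothetical sketch in Python (using scikit-learn and an invented four-sentence dataset, not any real system’s training data): the model is shown labeled examples, extracts whatever statistical patterns separate them, and reuses those patterns on new text.

```python
# Minimal sketch: the model is given labeled examples (an invented toy dataset),
# finds the statistical patterns that separate the labels, and applies those
# patterns to text it has never seen.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = [
    "I loved this film, wonderful acting",
    "great story and beautiful visuals",
    "terrible plot, I hated every minute",
    "boring, awful and a waste of time",
]
labels = ["positive", "positive", "negative", "negative"]

# Turn each sentence into word counts, then fit a simple classifier on them.
model = make_pipeline(CountVectorizer(), LogisticRegression())
model.fit(texts, labels)

# The decision is driven entirely by patterns found in the training sentences.
print(model.predict(["what a wonderful, beautiful story"]))  # -> ['positive']
```

Everything the model “knows” comes from those few labeled sentences; whatever patterns, and whatever prejudices, the data contains are exactly what it will reproduce.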
Even though we’ve been witnessing the growing presence of AI systems in our day-to-day lives for years now (think Siri, Alexa, or even simply Google), the most recent ones, ChatGPT above all, have caused a stir for their sophistication. The latest large language model released by OpenAI, ChatGPT is rumored to be headed for integration into Microsoft’s search engine, Bing. With such an influence on our online searches, our learning, and thus our perception, concerns about algorithmic bias are growing.
Biased algorithms: More than just an exception, a standard
Concerns about biased algorithms have existed since the early days of AI and have been proven right ever since. ChatGPT’s predecessor, GPT-3, was regularly found promoting hate speech, making Islamophobic jokes, and even engaging in racial profiling. The same pattern can be seen with almost all AI systems: Ask Delphi, for example, a machine-learning model that was supposed to give ethical advice, very quickly told users that “Being straight is more morally acceptable than being gay”.
To prevent the same thing from happening with ChatGPT, OpenAI found a solution: build AIs that could detect toxic language, such as hate speech, and help remove it from the model’s training data. But what seemed like a good idea in theory had a horrible impact in practice: to build those AIs, OpenAI hired Kenyan workers, paid roughly between $1.32 and $2 per hour, to review horrific content all day long. The task was so traumatic that Sama, the firm outsourcing the Kenyan workers, canceled its contract eight months earlier than planned.
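In practice, that kind of moderation layer amounts to scoring every example with a toxicity classifier and discarding anything above a threshold before the main model ever sees it. The sketch below is purely illustrative and not OpenAI’s actual pipeline: toxicity_score is a hypothetical stand-in for the classifiers those workers’ labels were used to build, and the threshold is arbitrary.

```python
# Purely illustrative sketch of a data-filtering step: score each example with
# a toxicity classifier and keep only what falls below a chosen threshold.
# toxicity_score is a hypothetical stand-in, not OpenAI's actual system.
from typing import Iterable, List


def toxicity_score(text: str) -> float:
    """Placeholder: a real classifier would return a learned probability
    (0.0 to 1.0) that the text contains hate speech or similar content."""
    blocklist = ("offensive_term_1", "offensive_term_2")  # illustrative only
    return 1.0 if any(term in text.lower() for term in blocklist) else 0.0


def filter_training_data(examples: Iterable[str], threshold: float = 0.5) -> List[str]:
    """Drop examples whose estimated toxicity meets or exceeds the threshold."""
    return [text for text in examples if toxicity_score(text) < threshold]


raw_corpus = ["an ordinary sentence", "a sentence containing offensive_term_1"]
print(filter_training_data(raw_corpus))  # -> ['an ordinary sentence']
```

Such a filter can only remove what its classifier recognizes as toxic; it says nothing about what the data fails to represent in the first place.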
Beyond its harmful foundation and outcomes, this practice only addresses AI’s biases at a surface level: discrimination lies not only in what shouldn’t be said, but also in what should be seen and yet isn’t.
The big absentees of AI databases
AI technology can only be as good and as fair as the data it is trained on. The datasets behind AI systems, however large (ChatGPT was trained on more than 570 GB of data), are more often than not incomplete and biased, reproducing human bigotry and the lack of unprejudiced research on women, people of color, and queer individuals.
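A toy experiment, in the same spirit as the sketch above, shows how that skew carries straight through to a model’s predictions. The dataset below is invented and deliberately lopsided: every “engineer” sentence uses “he” and every “nurse” sentence uses “she”, so the classifier ends up treating the pronoun itself as the deciding signal.

```python
# Invented, deliberately skewed toy data: the pronoun is perfectly correlated
# with the profession, so the model learns the stereotype as if it were a fact.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = [
    "he fixed the server and wrote the code",
    "he designed the bridge and ran the tests",
    "she comforted the patient and checked the chart",
    "she prepared the medication and took the vitals",
]
labels = ["engineer", "engineer", "nurse", "nurse"]

model = make_pipeline(CountVectorizer(), LogisticRegression())
model.fit(texts, labels)

# Two profession-neutral sentences that differ only in the pronoun:
probes = ["he answered the phone at the clinic",
          "she answered the phone at the clinic"]
print(model.predict(probes))  # the pronoun alone flips the prediction
```

The model is not malicious; it is faithfully reproducing the only world its data describes.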
The same observation can be made with AI “art” generators like DALL·E or Artbreeder. Asked to depict the evolution of human visual art, they will show you almost exclusively Western art. As Professor Amelia Winger-Bearskin points out, “All AI is only backward-looking”, and it cannot improve without a better database.
On top of this bias in research, and thus in the data fed to AI systems, there is the bias of the researchers themselves: the field is overwhelmingly male- and white-dominated. If the people in the highest positions are always the same, how can we expect real change and genuine consideration of minorities and marginalized groups in AI?
But increasing the amount and diversity of data is not a solution in itself; the AI models themselves need to be rebuilt from scratch. If AI is going to become an even bigger part of our lives, we need to support different models. Initiatives like WinoQueer, a project by Virginia Felkner and her team built on tweets and articles in which queer people talk about themselves, open new paths to follow. A different paradigm is possible, and it needs to be supported.