Is Bias in NLP Models an Ethical Problem?

By Boris Delovski • Updated on Aug 3, 2022

It’s easy to get engrossed in all that artificial intelligence, or AI, can offer us. Every machine learning engineer will remember the first time they created a model that helped someone. Even people who aren’t specialized in the field can hardly help getting excited when they find an application that solves their problems efficiently (whether or not they know that AI is behind the solution).

And while AI is used in a multitude of fields, deep learning research has flourished in some fields in particular. One of those fields is natural language processing, commonly referred to as NLP. Advancements in NLP have made many positive improvements possible within the field of AI.

In practice, most people focus on the benefits brought about by AI technology. But can we responsibly disregard the negative sides of AI? In this article, I’ll explain how the same biases that exist in our society have crept into our deep learning models, and whether we should do something about it.


What is Bias?

These days, bias is mostly mentioned with a negative connotation. In a society that strives for equality, holding bias toward a person (or group of people) is exactly what we try to avoid. And while avoiding personal bias in our lives makes perfect sense for humans, should we apply the same standards to the machine learning and deep learning models we use?

Not so long ago this wouldn't have even been a valid question. But today the situation is different. With the rise in popularity of AI, machine learning and deep learning models have been introduced into many aspects of our lives. Most people use AI daily, whether they’re aware of it or not, and we regularly use it to shape our own behavior. For example, each time you use Google's search engine you’re using some form of AI. And we often make decisions based on Google search results (ever choose a movie based on Googled movie recommendations?). If the behavior of AI in turn influences our own behavior, are we warranted in applying some of our human standards to machines? And at which point does our AI-influenced “decision” stop being our decision at all?


If you go to a new city and want to visit the best restaurant, you’ll probably look at reviews that are available online. If the reviews are biased—or if the model that displays the reviews is biased—and you decide to base your decision on them, who’s really making the decision: you, or the machine? And to add one more question to our list of ideas to explore: Do we, as designers of AI, introduce bias into our models, or do the models only seem biased because they prefer one solution over another?


How is Bias Introduced to NLP Models?

The deep learning models we use are mostly “black-box” models. In other words, we define a model, compile it, train it on some data, and then use it to solve a particular problem. As designers, we typically stop there and rarely try to explain why a model makes a given decision. This doesn’t mean, however, that the underlying concepts that drive a model are unknown.

After all, every model can be described using mathematical equations. What it does mean is that it’s hard to say what a model bases a particular decision on, so we often don’t bother. That’s pretty standard practice for supervised learning models, in fact. It’s also standard for unsupervised learning models, where understanding a model’s exact decision seems even less relevant, since the model has no labeled answers to compare its output against.


Unfortunately, unsupervised learning is the crux of NLP. Natural language processing models are typically supervised learning models, but the data they use is created using models trained in an unsupervised way. We do this because we can't directly feed text into our models. Instead, we need to convert the text into representations of language that are digestible for our models. Those representations are called word embeddings.


Word embeddings are numerical representations of text data created by unsupervised models. A model trained in an unsupervised way scours large quantities of text and creates vectors to represent the words in the text. Unfortunately, by searching for hidden patterns and using them to create embeddings (which automatically group data), our models are exposed to more than just semantic information. While processing the text, models are also exposed to biases that are not unlike the ones present in human society. The biases then get propagated into our supervised learning models—the same models that are supposed to take in unbiased data to ensure they don't produce biased results.
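To make the idea of an embedding concrete, here is a minimal Python sketch. The three-dimensional vectors are invented for illustration (real embeddings are learned from large amounts of text and typically have hundreds of dimensions), and cosine_similarity is a small helper defined here, not a library function. The point is only that "similar" words end up as vectors pointing in similar directions.

```python
import math

# Hypothetical 3-dimensional embeddings, invented for illustration only.
toy_embeddings = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.9, 0.7, 0.2],
    "banana": [0.1, 0.0, 0.9],
}


def cosine_similarity(u, v):
    """Return the cosine of the angle between vectors u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Words used in similar contexts should score higher than unrelated words.
sim_related = cosine_similarity(toy_embeddings["king"], toy_embeddings["queen"])
sim_unrelated = cosine_similarity(toy_embeddings["king"], toy_embeddings["banana"])

print(f"king vs queen:  {sim_related:.3f}")
print(f"king vs banana: {sim_unrelated:.3f}")
```

Whatever patterns the training text contains, including biased ones, are encoded in these directions, which is why they can carry over into any model that consumes the vectors.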


If you’re skeptical about whether something like this can happen, let’s explore a few of the many examples of racial bias and gender bias that occur in deep learning.


What is Gender Bias?

A 2021 article published by The Brookings Institution’s Artificial Intelligence and Emerging Technology Initiative, part of a series entitled “AI and Bias,” discusses research into bias and NLP. The article describes a study that its author conducted at Princeton University's Center for Information Technology Policy in 2017.

The study revealed many instances of gender bias in machine learning language applications. While studying machines as they process word embeddings, the researchers found that the machines learned human-like biases from the word associations in their training material. For instance, the machine learning tools were learning that sentences that contained the word "woman" more often also included words such as "kitchen" and "art". In contrast, sentences that contained "man" more often also contained words such as "science" and "technology". As machine learning picks up on these patterns, these types of gender biases are propagated down the line. And if a supervised learning model uses biased embeddings, it will also produce biased results.
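The kind of pattern described here can be illustrated with simple sentence-level co-occurrence counts. The sketch below is a toy example in Python, not the method used in the study: the five-sentence corpus is made up and deliberately skewed, and cooccurrence_counts is a hypothetical helper written for this illustration.

```python
from collections import Counter

# Tiny made-up corpus, deliberately skewed to mimic the pattern described above.
corpus = [
    "the woman walked into the kitchen",
    "the woman studied art",
    "the man studied science",
    "the man works in technology",
    "the woman works in technology",
]


def cooccurrence_counts(sentences, target):
    """Count the words that appear in the same sentence as `target`."""
    counts = Counter()
    for sentence in sentences:
        words = sentence.split()
        if target in words:
            counts.update(w for w in words if w != target)
    return counts


woman_counts = cooccurrence_counts(corpus, "woman")
man_counts = cooccurrence_counts(corpus, "man")

# A Counter returns 0 for words it has never seen next to the target.
for word in ("kitchen", "art", "science", "technology"):
    print(word, "| woman:", woman_counts[word], "| man:", man_counts[word])
```

An embedding model trained on text with this kind of imbalance has no way of knowing that the imbalance is a social artifact rather than a fact about language, so it encodes it along with everything else.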

Image: Observation and evaluation of gender bias in NLP. Source: Sun, Tony, et al. "Mitigating gender bias in natural language processing: Literature review." (2019).

In 2015, instances of gender bias were reported in Google Search results. Adrienne Yapo and Joshua Weiss explore the revelation in their 2018 conference paper “Ethical Implications of Bias in Machine Learning”. In the backlash, Google users discussed their experiences with bias while searching. For instance, a search for a word like "CEO" returned mostly images of white males. The paper also discusses a study that found that the Google Search engine showed more ads for high-paying executive jobs to male users than to female users.

A more recent and widely discussed example of gender bias in NLP comes from GPT-3. In May 2020, OpenAI introduced the third-generation Generative Pre-trained Transformer, or GPT-3. The model was praised as the best NLP model available at the time, able to produce language that was hard to distinguish from human writing. However, even GPT-3 was not immune to gender bias. When prompted to assign a gender to certain occupations, GPT-3 reported that doctors are male and nurses are female.


What is Racial Bias?

In the 2017 Princeton study, racial bias was observed alongside gender bias. The study found that the negative biases on the internet toward African Americans and the Black community remained present in the model embeddings. When examining their model, the researchers found that, compared to traditionally white names, traditionally Black names were more strongly associated with words that carry a negative connotation.
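One simple way to quantify an association like this is to compare how close a word's vector sits to a set of pleasant words versus a set of unpleasant words. The Python sketch below shows a simplified score of that kind. It is an illustration under stated assumptions, not the study's exact procedure: the two-dimensional vectors, the placeholder name groups, and the cosine and association helpers are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between vectors u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def association(word_vec, pleasant_vecs, unpleasant_vecs):
    """Mean similarity to pleasant words minus mean similarity to unpleasant words."""
    pleasant = sum(cosine(word_vec, p) for p in pleasant_vecs) / len(pleasant_vecs)
    unpleasant = sum(cosine(word_vec, q) for q in unpleasant_vecs) / len(unpleasant_vecs)
    return pleasant - unpleasant


# Hypothetical 2-dimensional vectors, invented for illustration only.
pleasant_vecs = [[1.0, 0.1], [0.9, 0.2]]
unpleasant_vecs = [[0.1, 1.0], [0.2, 0.9]]
name_group_a = [0.8, 0.3]  # placeholder for a name from one group
name_group_b = [0.3, 0.8]  # placeholder for a name from another group

score_a = association(name_group_a, pleasant_vecs, unpleasant_vecs)
score_b = association(name_group_b, pleasant_vecs, unpleasant_vecs)

# A positive score means "closer to the pleasant words", a negative score the opposite.
print(f"group A association: {score_a:+.3f}")
print(f"group B association: {score_b:+.3f}")
```

With real embeddings, a consistent gap between the scores of two groups of names is the kind of signal that points to bias absorbed from the training text.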

This type of racial bias was documented long before 2017, and not only in machines. The Brookings article also cites a 2004 labor market study, a field experiment with real employers rather than machine learning models, in which resumes with traditionally white names received 50% more callbacks than equivalent resumes with African American names.


Does Bias Exist in Other Fields of Deep Learning?

Researchers have recognized cases of bias not only in NLP models but also in other deep learning models (although they’re most prominent in NLP models). Computer vision is another field of AI with no shortage of highly controversial cases of racial bias.


Photo or image recognition is an example of computer vision. In 2015, Flickr’s image recognition tool brought the company into the headlines when it reportedly tagged a photo of a Black man as an “ape”. In a similar instance, Google Photos was also in the public spotlight when it labeled a Black software developer and his friend as “gorillas” in personal photos.

Image: Flickr's problematic image


Is Bias an Ethical Issue?

Knowing that bias exists, the question becomes how to deal with it in our models. What can we do to make our models as unbiased as possible?

Whether we like it or not, word embeddings represent—to a certain degree—our culture, and using them to represent words leads to accurate models. This means that the biases we notice in our models exist because they exist in our society.

If we choose to create unbiased models, their accuracy will undoubtedly suffer. Trying to represent a biased society using an unbiased model will render the models themselves useless. On the other hand, biased models and our growing dependence on NLP cause a feedback loop that only accentuates the biases that harm our society today.

On paper, it’s an unsolvable dilemma. But maybe we should try to approach it from a different angle. Instead of trying to create unbiased models, we should consider starting to apply the same standards to our models that we apply to ourselves as humans.

Unfortunately, humans are biased, if not consciously, then unconsciously. Even when we think we’re making unbiased decisions, we probably aren’t. As humans, we try to understand and challenge our biased behavior using standards: an individual's behavior is compared to what are considered acceptable standards of behavior.

So if we can't prevent our models from making biased decisions, we can create standards in advance to audit them. In doing so, we don’t stop models from making biased decisions, but we create a method to deal with those decisions before they harm others. Creating a type of audit mechanism is a possible solution that would allow us to better understand how the behavior of our models might shape public opinion.
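As a rough idea of what such an audit could look like in code, the Python sketch below compares how often a model makes a favorable decision for each group and flags the model when the gap is too large. Everything in it is an assumption made for illustration: the predictions are made up, audit_parity and positive_rate are hypothetical helpers, and the 0.1 threshold is an arbitrary choice that a real standard would have to justify.

```python
def positive_rate(predictions):
    """Share of predictions that are favorable (encoded as 1)."""
    return sum(predictions) / len(predictions)


def audit_parity(predictions_by_group, max_gap=0.1):
    """Compare favorable-decision rates across groups and flag large gaps.

    max_gap is an arbitrary illustrative threshold, not an established standard.
    """
    rates = {
        group: positive_rate(preds)
        for group, preds in predictions_by_group.items()
    }
    gap = max(rates.values()) - min(rates.values())
    return {"rates": rates, "gap": gap, "flagged": gap > max_gap}


# Made-up model outputs (1 = favorable decision) for two hypothetical groups.
predictions_by_group = {
    "group_a": [1, 1, 1, 0, 1, 1, 0, 1],
    "group_b": [1, 0, 0, 0, 1, 0, 0, 1],
}

report = audit_parity(predictions_by_group)
print(report)
```

A check like this doesn't make the model any less biased. It only makes the bias visible early enough for a human to decide what to do about it, which is the role an audit plays for human decisions too.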

Looking ahead, we might even create models that specifically work on trying to solve these bias problems. Today, we have the opportunity to start investing more time and money into developing explainable models. Explainable models could lead to big breakthroughs not only in terms of dealing with biased models but also in deep learning as a whole.


There are likely many solutions to the bias problem in AI—and not all of them are completely unique and revolutionary ideas. So why haven't we already implemented measures to combat bias in deep learning models, and especially in bias-sensitive NLP models? This brings us to the most controversial question of the article: If we have the solution to bias in models, should we implement it?


Ethics vs. Business

In an ideal world, the answer is easy. But in reality, many factors are at play between the practical answer and the ethical one. The most important factor is motivation. Are companies today motivated to reduce bias in their models?

The answer is, unfortunately, a resounding no. The competitive market today doesn't allow for financial “wiggle room”, and as smaller companies fight to stay relevant, they choose to use the models that produce the best results—even if they’re biased. While bigger companies could afford to take some losses, they’re often governed by the idea that financial gain trumps everything else. For them, taking a loss is often not an option.

Let’s look at this from another angle and say motivation to reduce bias isn’t a significant factor. Let’s assume that everyone is motivated to de-bias their models as much as possible. This scenario still has problems that need to be addressed.

Sometimes we want our models to be biased. For example, if a company sells boots for men, it’s perfectly logical to create a model biased towards males. Similarly, if you’re a company selling shoes with high heels, you’ll probably get better results if you bias your model toward females.

The overarching theme here? Everything is relative. In some cases, bias is good. In others, it isn’t. Overall, the harder we try the more we run into convoluted ethical dilemmas. And we haven’t even touched on the ideas around rules of law and morality applying to AI, an object without basic human rights!

The short answer is that there’s no readily available solution today. The best we can do individually is to try to be less biased ourselves. And because our models mirror our society, a collective change in behavior will also change our datasets, and might one day produce truly unbiased embeddings from which our models can learn.



AI is already a big part of our lives, and that won’t change in the future. On the contrary, its role in our lives is becoming steadily more significant. AI models choose the news we're exposed to each day and the commercials we'll see first each morning. To disregard the ethical implications of leaving something as influential as AI models unchecked is a mistake, and leads to questionable results like the examples of bias in this article.


At times, censoring models seems like the only logical thing to do. However, I find that using AI as a scapegoat only serves to hide the truth that our society is a biased one. The models we train learn simply and efficiently through examples. So if we see our models making biased decisions, most likely we, as humans, are the example to blame. The models only know to simulate what they've been taught, so with models as our mirror we see the flaws in our own reflection that we don't want to see.

The real solution to dealing with bias and ethical problems in our models is to solve those ethical problems within ourselves. Fixing ourselves keeps the accuracy of our models at a high level while correcting for bias. Now is the time to begin leading by example for a better future with AI, instead of covering up our past mistakes.

Boris Delovski

Data Science Trainer

Boris is a data science trainer and consultant who is passionate about sharing his knowledge with others.

Before Edlitera, Boris applied his skills in several industries, including neuroimaging and metallurgy, using data science and deep learning to analyze images.