How to Detect Emotions in Images Using Python

It's easy to perform emotion detection in images with the pre-built Python Libraries.
By Boris Delovski • Updated on Mar 2, 2023
blog image

In the previous article of this series "How Emotional Artificial Intelligence Can Improve Education," I covered the following topics:


  • What is affective computing?
  • What is emotional artificial intelligence?
  • What can you analyze to get a better understanding of how someone feels?
  • How can you apply the aforementioned in the education industry?

One of the easiest, and yet also the most effective ways of analyzing how people feel is looking at their facial expressions. Most of the time, our face best describes how we feel in a particular moment. This means that emotion recognition is a simple multiclass classification problem. You need to analyze a person's face and put it in a particular class, where each class represents a particular emotion.

Analyzing faces is not always enough to gauge how somebody feels. Humans often try to hide how they feel. This can sometimes lead to misleading results if only emotion recognition in images is performed. However, in combination with other techniques (such as body language in images, or voice analysis in videos), you can get a pretty solid idea of how somebody feels.

Let's demonstrate how easy it is to perform emotion detection in images. You can use pre-built libraries that will allow you to easily analyze faces and get the results you want very quickly without using too much code.


What is Python's DeepFace Library?

The first library I am going to talk about is the DeepFace library. It is probably the most popular library for performing emotion analysis and similar tasks. DeepFace is an open-source project licensed under the MIT license. This means users can freely use, modify and distribute the library both for non-commercial and commercial purposes. This makes it perfect for anybody who might want to implement it in their practices. It serves as a framework for using already-trained deep learning models to perform image and video analysis. It offers much more than just emotion detection, even if that is what interests you most. 

The library uses pre-trained SOTA models (state-of-the-art models) in the background. SOTA models are those models that currently achieve the best possible results for some particular task on a set of benchmark datasets.

The models DeepFace uses in the background are:


  • VGG-Face
  • Google FaceNet
  • OpenFace
  • Facebook DeepFace
  • DeepID
  • ArcFace
  • Dlib

These models are so good that they have demonstrated that they can analyze images of faces (and even videos) at a level that surpasses what is humanly possible. The face recognition pipeline of DeepFace consists of four stages: 


  • Detection 
  • Alignment
  • Representation 
  • Verification

Let's demonstrate how DeepFace performs all of the aforementioned tasks using only one line of code.

Article continues below


How to Use the DeepFace Library

First, you need to install the library. Since it is published in the Python Package Index (PyPI), the easiest way to install it is to pip install: 

pip install deepface

This will automatically install everything you need to use this library.

Using the library is extremely simple. After you import the package, you just need to input an image. The library will give you a detailed analysis of that image.

Here's how DeepFace works on the following image:

 smiling woman

Image Source: Paul Ekman Group, The Science of Smiling. 

To start, import what you need:

from deepface import DeepFace

Then you can analyze the face present in the image:

face_analysis = DeepFace.analyze(img_path = "happy_face_woman.png")

And this is all there is to it if you don't want to customize the process of analysis too much. Running the code above will give you the following result:

{'emotion': {'angry': 4.476726101312781e-06,
  'disgust': 1.6381327493892675e-06,
  'fear': 0.0001274320160076828,
  'happy': 99.06393880033129,
  'sad': 0.02293923016927273,
  'surprise': 3.946005002585829e-06,
  'neutral': 0.9129819073070232},
 'dominant_emotion': 'happy',
 'region': {'x': 77, 'y': 104, 'w': 163, 'h': 163},
 'age': 31,
 'gender': 'Woman',
 'race': {'asian': 2.069193683564663,
  'indian': 7.127643376588821,
  'black': 0.4860048647969961,
  'white': 24.476712942123413,
  'middle eastern': 17.554299533367157,
  'latino hispanic': 48.28614890575409},
 'dominant_race': 'latino hispanic'}


As you can see, you are given a very detailed analysis. It gives you the following information:


  • Percentages for each of the 7 basic human emotions, and which is the dominant one
  • The bounding box coordinates for the face in the image with the region parameter
  • The predicted age of the person
  • The predicted gender of the person
  • The predicted race of the person (with percentages for different races)

Since the result you get is a dictionary, you can easily access different parts of it by referencing the keys of the dictionary:


The code above gives you the following result:

{'angry': 4.476726101312781e-06, 'disgust': 1.6381327493892675e-06, 'fear': 0.0001274320160076828, 'happy': 99.06393880033129, 'sad': 0.02293923016927273, 'surprise': 3.946005002585829e-06, 'neutral': 0.9129819073070232}

DeepFace will also work with grayscale images.

Let's take a look at an example of analyzing the emotions present in the following grayscale image:

 smiling woman

Image Source: Paul Ekman Group, The Science of Smiling. 

To analyze the image above using DeepFace, you will use the same code you used for the image that was in color:

face_analysis_2 = DeepFace.analyze(img_path="happy_face_grayscale.png")


This will lead to the following result:

{'angry': 2.8718812601394677e-18, 'disgust': 2.5457508031498726e-35, 'fear': 1.3584258743615688e-23, 'happy': 100.0, 'sad': 1.4448950023722881e-16, 'surprise': 1.16495389723692e-09, 'neutral': 4.1699252051330404e-06}



While DeepFace might seem like the best solution in all cases, there is a caveat. Since the image needs to go through all of the stages during the pipeline, it can sometimes get "stuck" at a stage.

Let's take a look at this image:

a happy woman in black and white

Image Source: FER-2013 dataset. 

This is one of the images from the FER (Face Emotion Recognition), a dataset of 48x48 pixel images representing faces showing different emotions. DeepFace will run into a problem at the face detection part of the pipeline and throw out the following error:

# ValueError: Face could not be detected. Please confirm that the picture is a face photo or consider to set enforce_detection param to False.

In this case, there are two ways to solve this problem:


  • Follow what DeepFace suggests as a solution and set the enforce_detection 
    parameter to False OR
  • Use some other library

Following the suggestion given from the DeepFace library, you can run the following:

face_analysis_3 = DeepFace.analyze(img_path = "FER_dataset_image.png", enforce_detection=False)


This gives you the following result:

{'angry': 0.0008810167471331748, 'disgust': 8.797318595862103e-12, 'fear': 8.577033639407524e-06, 'happy': 99.99908804888058, 'sad': 4.79596746481186e-07, 'surprise': 6.102041458345537e-08, 'neutral': 2.3184728053760715e-05}


The result seems to be good, so this is a valid option. However, another option for cases like these is to use another library. One such library is the FER library.


What is Python's FER Library?

The Facial Expression Recognition (FER) library is an open-source library created and maintained by Justin Shenk, co-founder of VisioLab, that allows you to perform emotion recognition on both images and videos with just a few lines of code. It is not as versatile as the DeepFace library. You can only use it for emotion recognition. Nonetheless, it is still very powerful, and in this case practical since it works out-of-the-box, even with images of low quality.  


The library combines deep learning with OpenCV functionalities to perform emotion recognition in images. The way it works is pretty simple. You pass an image to the FER constructor, which gets initialized using either the OpenCV Haar Cascade classifier or a multi cascade convolutional network (MTCNN). As a result, you get an array of values assigned to each of the aforementioned basic emotions, in percentages between 0 and 1. If you want, you can also access just the value of the dominant emotion.

I'll demonstrate how analyzing emotions using FER works.


How to Use the FER library

FER is also available on PyPI, which means that you can install it very easily by running the following code:

pip install fer

After installing, the first thing you will do is import what you need:

from fer import FER
import cv2

Now you can define your emotion detector.

For this example let's use MTCNN. If you set the argument mtcnn to True, the detector will use the MTCNN.

If you set it to False, it will use the Haar Cascade classifier.

emotion_detector = FER(mtcnn=True)

You can now define the image that you want to analyze. Let's use an image that has multiple faces in it to demonstrate that FER can analyze multiple faces at once.

different facial expressions

Image Source: Paul Ekman Group, The Science of Smiling.  

test_img = cv2.imread("multiple_faces.png")
analysis = emotion_detector.detect_emotions(test_img)

The result you get is a list of dictionaries, where each dictionary represents one face. It provides you with bounding box coordinates and an analysis of the emotions of the people in the images: 

[{'box': (778, 133, 163, 163),
  'emotions': {'angry': 0.0,
   'disgust': 0.0,
   'fear': 0.0,
   'happy': 0.66,
   'sad': 0.01,
   'surprise': 0.0,
   'neutral': 0.32}},
 {'box': (467, 158, 165, 165),
  'emotions': {'angry': 0.0,
   'disgust': 0.0,
   'fear': 0.0,
   'happy': 1.0,
   'sad': 0.0,
   'surprise': 0.0,
   'neutral': 0.0}},
 {'box': (149, 437, 128, 128),
  'emotions': {'angry': 0.0,
   'disgust': 0.0,
   'fear': 0.0,
   'happy': 1.0,
   'sad': 0.0,
   'surprise': 0.0,
   'neutral': 0.0}},
 {'box': (468, 443, 152, 152),
  'emotions': {'angry': 0.03,
   'disgust': 0.01,
   'fear': 0.01,
   'happy': 0.85,
   'sad': 0.01,
   'surprise': 0.02,
   'neutral': 0.07}},
 {'box': (783, 421, 164, 164),
  'emotions': {'angry': 0.0,
   'disgust': 0.0,
   'fear': 0.0,
   'happy': 0.98,
   'sad': 0.0,
   'surprise': 0.0,
   'neutral': 0.02}},
 {'box': (163, 123, 146, 146),
  'emotions': {'angry': 0.0,
   'disgust': 0.0,
   'fear': 0.0,
   'happy': 1.0,
   'sad': 0.0,
   'surprise': 0.0,
   'neutral': 0.0}}]


Of course, FER would work even if the image you inputted was grayscale. 


How to Use FER to Analyze Low-Quality Images

While DeepFace has problems working with low-quality images (at least out-of-the-box), FER doesn't.

I'll demonstrate that on the low-quality image from before:

test_img_low_quality= cv2.imread("FER_dataset_image.png")
analysis = emotion_detector.detect_emotions(test_img_low_quality)

Running the code above will give you the following result:

[{'box': (0, 0, 45, 45),
  'emotions': {'angry': 0.0,
   'disgust': 0.0,
   'fear': 0.0,
   'happy': 0.9,
   'sad': 0.0,
   'surprise': 0.09,
   'neutral': 0.0}}]

This demonstrates how well FER performs on low-quality images. You can also access only the most dominant emotion by changing the code just a little bit:

dominant_emotion, emotion_score = emotion_detector.top_emotion(test_img_low_quality)
print(dominant_emotion, emotion_score)

This will give you the following result:

happy 0.9

Emotion recognition is a field that keeps advancing at tremendous speed. One of the most important aspects of analyzing emotions is analyzing human faces. While the technology is still not perfect, advanced emotion recognition models outperform humans when it comes to emotion recognition in images. Of course, there are certain limitations to models, such as the one I demonstrated when working with the DeepFace library. However, most of the time, the results you get are pretty reliable.

While it is possible to build a custom model, for over 90% of users that is not necessary. The libraries that do exist are open-source, can be used for both commercial and non-commercial purposes, and allow users to perform emotion recognition using just a few lines of code.

Probably the most popular libraries for performing emotion recognition are DeepFace and FER. In this article, I demonstrated how to use both of them, and I also pointed out the advantages and disadvantages of each of the two libraries. In tandem, they form the perfect duo for performing emotion recognition.

In the next article of this series, I will demonstrate how to perform emotion recognition on videos. I will try to predict whether a student is interested in a particular lecture or not. This could in the future become a very powerful tool that helps teachers, professors, and those in the education industry better cater to the needs of their students and make education more effective.


Boris Delovski

Data Science Trainer

Boris Delovski

Boris is a data science trainer and consultant who is passionate about sharing his knowledge with others.

Before Edlitera, Boris applied his skills in several industries, including neuroimaging and metallurgy, using data science and deep learning to analyze images.