Bias and Ambiguity in Facial Expression Datasets
This experimental research examines an open-source facial expression image dataset and addresses questions about bias and ambiguity.
Is it possible for datasets to be unbiased? Do facial expression datasets simply reinforce existing assumptions about what different expressions mean?
Introduction
The research phase was iterative and based on experimenting with visual elements and examining their connections.
This exploratory research centered on working with visual corpora, investigating recurring motifs, and examining relations within collections of images. The project examined the layers of meaning in facial expressions and questioned assumptions about what these images represent.
Details of the Dataset
The dataset I used is open-source and available on Hugging Face. It contains a total of 800 images, with each of the following emotion categories represented by 100 images:
- Happy
- Anger
- Sad
- Contempt
- Disgust
- Fear
- Surprise
- Neutral
Each image is labeled with one of these emotion categories. I selected this dataset because it appeared relatively diverse in terms of age, race, and gender representation, which made it a suitable starting point for exploring potential biases and assumptions in facial expression data.
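For reference, a dataset like this can be loaded with the Hugging Face `datasets` library. The sketch below is a minimal example, assuming a hypothetical dataset identifier and a `label` column; the actual repository name and column names may differ.

```python
# Minimal sketch: load the dataset and check the label distribution.
# The dataset identifier and the "label" column name are assumptions.
from collections import Counter
from datasets import load_dataset

dataset = load_dataset("username/facial-expressions", split="train")  # hypothetical ID

label_names = dataset.features["label"].names
counts = Counter(dataset["label"])
for label_id, count in sorted(counts.items()):
    print(f"{label_names[label_id]}: {count}")  # expected: 100 per emotion
```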
Process Overview
During this exploratory process, I conducted several analyses and experiments. The structure of the analysis is outlined on the side. Each section is explained below, and the results from one session lead into the next.
Computational Image Analysis
Average Image Analysis
"Images are made up of pixels, which are data points that represent the relationships between different parts of the image."
In my dataset, there are 800 images of the same size, each showing a different individual. I wanted to create an average image for each emotion category. To do this, I used Google Colab to run code that overlays and averages the images.
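The averaging step needs only a few lines of NumPy. The sketch below shows one way to do it for a single category, assuming the images for that category sit in a local folder and share the same dimensions; the folder path is a placeholder.

```python
# Minimal sketch: pixel-wise average of all images in one emotion category.
# Assumes equal image sizes; the folder path is a placeholder.
import glob
import numpy as np
from PIL import Image

paths = glob.glob("dataset/happy/*.jpg")  # hypothetical folder layout

# Stack the images as float arrays and take the mean over the image axis.
stack = np.stack([np.asarray(Image.open(p).convert("RGB"), dtype=np.float32)
                  for p in paths])
average = stack.mean(axis=0).astype(np.uint8)

Image.fromarray(average).save("average_happy.png")
```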
The average images (on the right) reveal consistent facial structure because the images share a uniform size, with eyes, nose, and mouth typically positioned in similar locations. Identifying gender in the averaged images becomes difficult.
The "sad" average appears older, while "surprise" resembles a younger person, which prompted a closer examination of each image to assess the dataset's diversity.
It becomes difficult to assign a single label to the average images without their "original" labels.
Color Analysis
I ran color analysis code in Google Colab, generating graphs based on the mean RGB values of each image. The plot patterns reflect color similarities rather than structural or expressive features, with images of similar dominant colors (like skin tones, lighting, or backgrounds) clustering together.
The color analysis allowed me to visualize all the images in one graph and provided an opportunity to compare them side by side.
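The plot described above can be reproduced roughly as follows. This is a sketch under the same assumed folder layout; the choice of axes (mean red vs. mean green) is illustrative, not necessarily the one used in my notebook.

```python
# Minimal sketch: scatter the images by their mean RGB values.
# Folder layout and axis choice are illustrative assumptions.
import glob
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image

paths = glob.glob("dataset/*/*.jpg")  # hypothetical: one subfolder per emotion

# Mean RGB value of each image, shape (n_images, 3).
means = np.array([
    np.asarray(Image.open(p).convert("RGB"), dtype=np.float32).reshape(-1, 3).mean(axis=0)
    for p in paths
])

# Color each point by its own mean color so clusters are visible at a glance.
plt.scatter(means[:, 0], means[:, 1], c=means / 255.0, s=12)
plt.xlabel("Mean red")
plt.ylabel("Mean green")
plt.title("Images grouped by mean color")
plt.show()
```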
Analysis & Experiments
"Some basic human emotions (happiness, sadness, anger, fear, surprise, disgust and contempt) are innate and shared by everyone, and that they are accompanied across cultures by universal facial expressions."
Paul Ekman
Paul Ekman suggests that each basic emotion has a corresponding, universal facial expression. This means that regardless of cultural background, people are likely to display emotions in similar ways through facial cues.
However, analysing this dataset showed me that facial expressions are inherently ambiguous, making it challenging to generalize them. Distinguishing between certain emotions, like fear and surprise, is particularly difficult because their facial cues are quite similar. Key features that contribute to these expressions include the eyes, eyebrows, and mouth, with the eyebrows playing a crucial role in conveying specific emotions. This raises an interesting question:
Which facial features do we focus on when interpreting an emotion?
CodeChart Experiment
To explore which facial features people focus on when interpreting emotions, I used open-source eye-tracking code to track participants' attention on images from my dataset. I selected one image for each emotion category and created an experiment in p5.js. The experiment flow is outlined below.
The experiment can be found here.
Analysis of the Experiment
The analysis showed a general tendency for participants to focus on the "left eye" across emotions and highlighted influences such as background brightness on attention. However, it was difficult to draw concrete conclusions from the eye-tracking data, as the small sample size didn't support any firm implications.
While some emotions (like disgust, happiness, and sadness) were easier to identify, others remained ambiguous, underscoring the challenge of assigning a single, definitive emotion to an image.
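For completeness, the aggregation behind the "left eye" observation above can be sketched as a simple region count over the recorded gaze points. The CSV format and region boxes below are hypothetical; they only illustrate the idea, not the exact output of the p5.js sketch.

```python
# Minimal sketch: count gaze points falling inside rough facial regions.
# The CSV columns and the bounding boxes are hypothetical.
import csv
from collections import Counter

# Rough bounding boxes (x_min, y_min, x_max, y_max) for one stimulus image.
regions = {
    "left_eye":  (120, 140, 190, 180),
    "right_eye": (210, 140, 280, 180),
    "mouth":     (150, 260, 250, 310),
}

counts = Counter()
with open("gaze_points.csv") as f:       # hypothetical export of the eye-tracking data
    for row in csv.DictReader(f):        # expected columns: x, y
        x, y = float(row["x"]), float(row["y"])
        for name, (x0, y0, x1, y1) in regions.items():
            if x0 <= x <= x1 and y0 <= y <= y1:
                counts[name] += 1

print(counts.most_common())
```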
GenAI Experiments
Since the data from my sample didn’t yield concrete results but highlighted the ambiguity in assigning emotions to specific facial expressions, I became curious to see how a GenAI model (ChatGPT-4) would interpret emotions in these images and which facial features it would consider most relevant. I ran a quick experiment, and here is one of the results.
Prompt: I’m going to upload some pictures, and I’d like you to tell me which emotions you see in them and which facial features led you to interpret them that way.
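I ran this interactively in ChatGPT, but the same prompt could be scripted against a vision-capable model to cover more images. The sketch below uses the OpenAI Python client; the model name and file path are assumptions, not what I actually used.

```python
# Minimal sketch: send one dataset image with the prompt to a vision-capable model.
# Model name and image path are assumptions; requires OPENAI_API_KEY in the environment.
import base64
from openai import OpenAI

client = OpenAI()

with open("dataset/fear/example.jpg", "rb") as f:  # hypothetical image path
    image_b64 = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="gpt-4o",  # assumed vision-capable model
    messages=[{
        "role": "user",
        "content": [
            {"type": "text",
             "text": "Which emotion do you see in this picture, and which facial "
                     "features led you to interpret it that way?"},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
        ],
    }],
)
print(response.choices[0].message.content)
```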
Main Arguments
If avoiding bias completely isn't possible, how can we work with biased data responsibly and for the benefit of humans? Where is the line?
All knowledge can become data, but some of it remains implicit. How can we make implicit knowledge clear and tangible?
Do we really need AI to recognize human emotions? What is the value of automating such personal and subjective interpretations?