Computational Analysis of Facial Expressions

Shenoy, A.

View/Open

Author

Shenoy, A.

Abstract

This PhD work constitutes a series of inter-disciplinary studies that use biologically plausible computational techniques and experiments with human subjects in analyzing facial expressions. The performance of the computational models and human subjects in terms of accuracy and response time are analyzed. The computational models process images in three stages. This includes: Preprocessing, dimensionality reduction and Classification. The pre-processing of face expression images includes feature extraction and dimensionality reduction. Gabor filters are used for feature extraction as they are closest biologically plausible computational method. Various dimensionality reduction methods: Principal Component Analysis (PCA), Curvilinear Component Analysis (CCA) and Fisher Linear Discriminant (FLD) are used followed by the classification by Support Vector Machines (SVM) and Linear Discriminant Analysis (LDA). Six basic prototypical facial expressions that are universally accepted are used for the analysis. They are: angry, happy, fear, sad, surprise and disgust. The performance of the computational models in classifying each expression category is compared with that of the human subjects. The Effect size and Encoding face enable the discrimination of the areas of the face specific for a particular expression. The Effect size in particular emphasizes the areas of the face that are involved during the production of an expression. This concept of using Effect size on faces has not been reported previously in the literature and has shown very interesting results. The detailed PCA analysis showed the significant PCA components specific for each of the six basic prototypical expressions. An important observation from this analysis was that with Gabor filtering followed by non linear CCA for dimensionality reduction, the dataset vector size may be reduced to a very small number, in most cases it was just 5 components. The hypothesis that the average response time (RT) for the human subjects in classifying the different expressions is analogous to the distance measure of the data points from the classification hyper-plane was verified. This means the harder a facial expression is to classify by human subjects, the closer to the classifying hyper-plane of the classifier it is. A bi-variate correlation analysis of the distance measure and the average RT suggested a significant anti-correlation. The signal detection theory (SDT) or the d-prime determined how well the model or the human subjects were in making the classification of an expressive face from a neutral one. On comparison, human subjects are better in classifying surprise, disgust, fear, and sad expressions. The RAW computational model is better able to distinguish angry and happy expressions. To summarize, there seems to some similarities between the computational models and human subjects in the classification process.

Publication date

2010-03-22

Metadata

Show full item record

University of Hertfordshire Research Archive