Developing preferential attention to a speaker: A robot learning to recognise its carer
In this paper we present ERWIN (Emotional Robot With Intelligent Networks), a socially interactive multi-modal robotic head capable of expressing emotion and interacting via speech and vision. The model presented shows how a robot can learn to attend to the voice of a specific speaker and provide a relevant emotional expression in response, based on previous interactions. We show three aspects of the system: first, the learning phase, in which the robot learns faces and voices through interaction; second, recognition of the learnt faces and voices; and third, emotion expression. We demonstrate these from the perspective of an adult and child interacting and playing a small game, much like an infant and caregiver situation. We also discuss the importance of speaker recognition for human-robot interaction and emotion, showing how the interaction process between a participant and ERWIN can lead the robot to attend preferentially to that person.