Social Interactive Human Video Synthesis

In this paper, we propose a computational model for social interaction between three people in a conversation, and demonstrate results using human video motion synthesis. We use semi-supervised computer vision techniques to label social signals between the people, such as laughing, head nods and gaze direction. Data mining is used to deduce frequently occurring patterns of social signals between a speaker and a listener in both interested and not interested social scenarios, and the mined confidence values are used as conditional probabilities to animate social responses. The human video motion synthesis uses an appearance model to learn a multivariate probability distribution, combined with a transition matrix to derive the likelihood of motion given a pose configuration. Our system uses social labels to define motion transitions more accurately and to build a texture motion graph. Traditional motion synthesis algorithms are best suited to large human movements like walking and running, where motion variations are large and prominent. Our method focuses on generating more subtle human movements such as head nods. The user can then control who speaks and the interest level of the individual listeners, resulting in socially interactive conversational agents.
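The step from mined confidence values to animated responses can be illustrated with a minimal sketch. It assumes the standard association-rule definition of confidence, count(A and C) / count(A), and assumes each listener signal is triggered independently with that probability. The page does not state the mining algorithm or the sampling scheme used in the paper, so the function names, signal names and toy data below are all hypothetical.

```python
import random


def rule_confidence(windows, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent.

    Computed as count(antecedent and consequent) / count(antecedent),
    i.e. an empirical estimate of P(consequent | antecedent).
    """
    antecedent = frozenset(antecedent)
    both = antecedent | frozenset(consequent)
    n_antecedent = sum(1 for w in windows if antecedent <= w)
    n_both = sum(1 for w in windows if both <= w)
    return n_both / n_antecedent if n_antecedent else 0.0


def sample_response(confidences, rng):
    """Trigger each listener signal independently with its confidence as probability."""
    return {signal for signal, p in confidences.items() if rng.random() < p}


# Hypothetical toy data: each set holds the social signals seen in one time window.
windows = [
    frozenset({"speaker_laugh", "listener_laugh"}),
    frozenset({"speaker_laugh", "listener_laugh", "listener_nod"}),
    frozenset({"speaker_laugh"}),
    frozenset({"speaker_laugh", "listener_nod"}),
    frozenset({"listener_nod"}),
]

conf_laugh = rule_confidence(windows, {"speaker_laugh"}, {"listener_laugh"})
conf_nod = rule_confidence(windows, {"speaker_laugh"}, {"listener_nod"})

# Decide how the listener reacts when the speaker laughs.
rng = random.Random(0)
response = sample_response(
    {"listener_laugh": conf_laugh, "listener_nod": conf_nod}, rng
)
```

In a fuller system, the sampled signals would select which labelled transitions of the texture motion graph to follow. Separate rule sets mined from interested and not interested recordings would give the user control over each listener's interest level.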


[Image showing human video texture synthesis of conversational avatars. (A), (B) and (C) show different generated videos given the scenarios of an interested and not interested listener in the conversation. (D) demonstrates the diversity of the approach, allowing identical avatars to socially interact together.]

Dumebi Okwechime
Eng-Jon Ong
Andrew Gilbert
Richard Bowden


Social Interactive Human Video Synthesis
Okwechime, D., Ong, E.-J., Gilbert, A., Bowden, R.
In Asian Conference on Computer Vision, ACCV10, Queenstown, New Zealand 2010.
[pdf 1 MB] [bibtex]

ACCV2010 Oral Presentation Video
With Live Commentary (19:05 mins) - (Requires QuickTime to view)
[Video 31.5 MB]

ACCV2010 Supplementary Material
Includes Audio Commentary (2:06 mins)
[Video 10.5 MB]