Analyzing Input and Output Representations for Speech-Driven Gesture Generation

Taras Kucherenko; Dai Hasegawa; Gustav Eje Henter; Naoshi Kaneko; Hedvig Kjellström

doi:10.1145/3308532.3329472

What is it about?

We present a machine-learning model that generates 3D gesture motion from speech. It uses a chain of several neural networks and performs better than the baseline.

Why is it important?

Human communication is to a large extent non-verbal. While talking, people spontaneously gesticulate, and these gestures play a key role in conveying information. If we want interaction with social agents (such as robots or virtual avatars) to be natural and smooth, we need to enable them to gesticulate as well.

This page is a summary of: Analyzing Input and Output Representations for Speech-Driven Gesture Generation, July 2019, ACM (Association for Computing Machinery),
DOI: 10.1145/3308532.3329472.

Contributors

The following have contributed to this page

Taras Kucherenko
KTH Royal Institute of Technology

Developing a model to generate hand gestures with a better representation of human poses.
