arXiv 2512.14234

ViBES: A Conversational Agent with Behaviorally-Intelligent 3D Virtual Body

By Juze Zhang, Changan Chen, et al.

Published 2025-12-16

Wiki summary

Human communication is inherently multimodal and social: words, prosody, and body language jointly carry intent. Yet most prior systems model human behavior as a translation task (co-speech gesture or text-to-motion) that maps a fixed utterance to motion clips, without requiring agentic decision-making about when to move, what to do, or how to adapt across multi-turn dialogue. This leads to brittle timing, weak social…
