arXiv 2512.14234

ViBES: A Conversational Agent with Behaviorally-Intelligent 3D Virtual Body

By Juze Zhang, Changan Chen, et al.

Published 2025-12-16

Human communication is inherently multimodal and social: words, prosody, and body language jointly carry intent. Yet most prior systems model human behavior as a translation task (co-speech gesture or text-to-motion) that maps a fixed utterance to motion clips, without requiring agentic decision-making about when to move, what to do, or how to adapt across multi-turn dialogue. This leads to brittle timing, weak social…
