arXiv 2311.12871
An Embodied Generalist Agent in 3D World
By Jiangyong Huang, Silong Yong, et al.
Published 2023-11-18
Leveraging massive knowledge from large language models (LLMs), recent machine learning models show notable successes in general-purpose task solving in diverse domains such as computer vision and robotics. However, several significant challenges remain: (i) most of these models rely on 2D images yet exhibit a limited capacity for 3D input; (ii) these models rarely explore the tasks inherently defined in the 3D world, e…