arXiv 2311.12871

An Embodied Generalist Agent in 3D World

By Jiangyong Huang, Silong Yong, et al.

Published 2023-11-18

Leveraging massive knowledge from large language models (LLMs), recent machine learning models show notable successes in general-purpose task solving in diverse domains such as computer vision and robotics. However, several significant challenges remain: (i) most of these models rely on 2D images yet exhibit a limited capacity for 3D input; (ii) these models rarely explore the tasks inherently defined in the 3D world, e…
