arXiv 2410.23262
EMMA: End-to-End Multimodal Model for Autonomous Driving
By Jyh-Jing Hwang, Runsheng Xu, et al.
Published 2024-10-30
We introduce EMMA, an End-to-end Multimodal Model for Autonomous driving. Built on a multimodal large language model foundation such as Gemini, EMMA directly maps raw camera sensor data into various driving-specific outputs, including planner trajectories, perception objects, and road graph elements. EMMA maximizes the utility of world knowledge from pre-trained large language models by representing all non-sen…