arXiv 2410.23262

EMMA: End-to-End Multimodal Model for Autonomous Driving

By Jyh-Jing Hwang, Runsheng Xu, et al.

Published 2024-10-30

We introduce EMMA, an End-to-end Multimodal Model for Autonomous driving. Built upon a multi-modal large language model foundation like Gemini, EMMA directly maps raw camera sensor data into various driving-specific outputs, including planner trajectories, perception objects, and road graph elements. EMMA maximizes the utility of world knowledge from the pre-trained large language models, by representing all non-sen…
