arXiv 2510.07077

Vision-Language-Action Models for Robotics: A Review Towards Real-World Applications

By Kento Kawaharazuka, Jihoon Oh, et al.

Published 2025-10-08

Wiki summary

Amid growing efforts to leverage advances in large language models (LLMs) and vision-language models (VLMs) for robotics, Vision-Language-Action (VLA) models have recently gained significant attention. By unifying at scale the vision, language, and action data that have traditionally been studied separately, VLA models aim to learn policies that generalise across diverse tasks, objects, embodiments, and environments.…
