arXiv 2511.14759

$π^{*}_{0.6}$: a VLA That Learns From Experience

By Physical Intelligence, Ali Amin, et al.

Published 2025-11-18

We study how vision-language-action (VLA) models can improve through real-world deployments via reinforcement learning (RL). We present a general-purpose method, RL with Experience and Corrections via Advantage-conditioned Policies (RECAP), that provides for RL training of VLAs via advantage conditioning. Our method incorporates heterogeneous data into the self-improvement process, including demonstrations, data fro…
