arXiv 2502.14499
MLGym: A New Framework and Benchmark for Advancing AI Research Agents
By Deepak Nathani, Lovish Madaan, et al.
Published 2025-02-20
We introduce Meta MLGym and MLGym-Bench, a new framework and benchmark for evaluating and developing LLM agents on AI research tasks. This is the first Gym environment for machine learning (ML) tasks, enabling research on reinforcement learning (RL) algorithms for training such agents. MLGym-Bench consists of 13 diverse and open-ended AI research tasks from domains such as computer vision, natural language p…