arXiv 2502.14499
MLGym: A New Framework and Benchmark for Advancing AI Research Agents
By Deepak Nathani, Lovish Madaan, et al.
Published 2025-02-20
We introduce Meta MLGym and MLGym-Bench, a new framework and benchmark for evaluating and developing LLM agents on AI research tasks. This is the first Gym environment for machine learning (ML) tasks, enabling research on reinforcement learning (RL) algorithms for training such agents. MLGym-Bench consists of 13 diverse and open-ended AI research tasks from domains such as computer vision, natural language p…