arXiv 2211.09808

Uni-Perceiver v2: A Generalist Model for Large-Scale Vision and Vision-Language Tasks

By Hao Li, Jinguo Zhu, et al.

Published 2022-11-17

Wiki summary

Despite the remarkable success of foundation models, their task-specific fine-tuning paradigm makes them inconsistent with the goal of general perception modeling. The key to eliminating this inconsistency is to use generalist models for general task modeling. However, existing attempts at generalist models are inadequate in both versatility and performance. In this paper, we propose Uni-Perceiver v2, which is the f…
