arXiv 2208.04464

In the Eye of Transformer: Global-Local Correlation for Egocentric Gaze Estimation

By Bolin Lai, Miao Liu, et al.

Published 2022-08-08

In this paper, we present the first transformer-based model to address the challenging problem of egocentric gaze estimation. We observe that the connection between the global scene context and local visual information is vital for localizing the gaze fixation from egocentric video frames. To this end, we design the transformer encoder to embed the global context as one additional visual token and further propose a…
