arXiv 2506.15953
ViTacFormer: Learning Cross-Modal Representation for Visuo-Tactile Dexterous Manipulation
By Liang Heng, Haoran Geng, et al.
Published 2025-06-19
Dexterous manipulation is a cornerstone capability for robotic systems aiming to interact with the physical world in a human-like manner. Although vision-based methods have advanced rapidly, tactile sensing remains crucial for fine-grained control, particularly in unstructured or visually occluded settings. We present ViTacFormer, a representation-learning approach that couples a cross-attention encoder to fuse high…
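The abstract is cut off before it describes the encoder in detail, so the following is only a generic illustration of what cross-attention between two modalities looks like, not the authors' implementation. It is a minimal single-head scaled dot-product sketch in plain Python in which one set of tokens (here labeled visual) queries another (here labeled tactile). The function names, the toy token values, and the absence of learned projection matrices are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Single-head scaled dot-product cross-attention (illustrative only).

    queries: d-dim vectors from one modality (assumed: visual tokens)
    keys, values: d-dim vectors from the other modality (assumed: tactile tokens)
    Returns (outputs, weights); no learned projections are applied.
    """
    d = len(queries[0])
    scale = 1.0 / math.sqrt(d)
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by 1/sqrt(d).
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        w = softmax(scores)
        # Output is the attention-weighted average of the value vectors.
        out = [sum(wj * v[i] for wj, v in zip(w, values))
               for i in range(len(values[0]))]
        outputs.append(out)
        weights.append(w)
    return outputs, weights


# Hypothetical toy tokens, chosen only to make the behavior easy to inspect.
visual_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
tactile_tokens = [[2.0, 0.0], [0.0, 2.0]]

fused, attn = cross_attention(visual_tokens, tactile_tokens, tactile_tokens)
```

Each row of `attn` is a probability distribution over the tactile tokens, so each fused vector is a convex combination of tactile features selected by the corresponding visual token. A practical encoder would add learned query, key, and value projections, multiple heads, and residual connections.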