We propose Third-UMI: Wrist-Cam-Free Robot Manipulation via Third-Person Point Clouds and Kinematic Retargeting.
Using Third-UMI, we efficiently collect real-world demonstration data and deploy the policies learned from it on physical robots, enabling scalable data collection for visuomotor policy learning with minimal hardware.
Third-UMI uses a third-view camera and requires neither wrist cameras nor electronic grippers. At only 380 g, it is lightweight and enables 4.6× faster data collection than 3D-mouse teleoperation.